Open dkg opened 2 years ago
Thanks for the detailed write-up!
I agree it’s going to be weird and complex to solve as things are.
some thoughts in no particular order:
given the above, it may not be safe to rely on SONAME at all for xdotool until development practices are changed (and consistent in) whatever promises about ABI and/or API compatibility. If nothing about development changes, then we will be forever playing whack-a-mole with these problems, I think.
A small solution might be to require exact version matching of xdotool version and libxdo. To support this, I could add an ABI/API safe way of checking for this mismatch and providing an actionable error message when a problem is detected (that is, “xdotool version X.Y is not compatible with libxdo version X.Y-1”). Glibc does this and honestly I kinda hate it, though. For downstream packagers like Debian, this would mean pinning the exact version so that one couldn’t upgrade xdotool without also upgrading libxdo, as an example.
If the above isn’t feeling like a good solution, I’m open to exploring other options.
If we want ABI and API stability, or at least making accidental breakages more difficult, then safety could be found with additions to the test suite or at least my local dev/release cycle to catch these kinds of problems.
Another option would be to follow chrome and Firefox and just do major/soname version bumps for every release.
What do you think?
Thanks for thinking through the options with me here, @jordansissel.
Some thoughts on your proposals:
xdotool isn't the only thing that depends on libxdo -- in debian alone, there are three other packages:
And that's of course not even covering private tooling that someone else might have written to use libxdo that they would expect to just keep working across a libxdo upgrade.
So just giving up on SONAME compatibility and resorting to some kind of runtime check would require more coordination than just tight bindings between xdotool and libxdo.
The ABI changes we've run into in the past have been mainly due to struct changes. i think there was one function removal (xdo_get_keysym_charmap
) which was technically an ABI change, but no one used it so we just ignored it ☺
A regularly incremented SONAME bump (like firefox and chrome versioning) is a pain for packages that depend on your software, because they need to be rebuilt whenever the SONAME changes.
If you want to add some tests to your test suite, you could try to wire up something like the ABI compliance checker to make comparisons between the last git tag and the HEAD or something. I haven't tried to do that myself.
One way to think about this without a lot of work is just to diff xdo.h
from the last release, then, any of the following conditions require an SONAME bump:
charcodemap_t
)SEARCH_ROLE
passed to a pre-existing member)Otherwise, the only reason to need an SONAME bump is if the semantics of a function changes, which is pretty unlikely.
If you add a simple check for unknown flags (SIZE_USEHINTS*
in xdo_set_window_size
and xdo_wait_for_window_size
and SEARCH_*
in xdo_search_windows
) and return a distinct error when that happens, you should be able to handle backward compatibility as well.
Another way to maintain ABI stability going forward would be to hide the structs themselves behind opaque pointers, and have the library initialize them, and offer getter functions for the variables that the user might want to see, and setter functions for a variable that the user might want to actually set. That's probably not possible for charcode_map_t
because of how it's used as
an array, but xdo_t
and xdo_search_t
both seem like they could be done that way.
If you're interested in trying to do anything like the above, i'd be happy to review.
As time permits, I'm still considering how to approach this going forward. It seems like all the stable solutions require at least one more major ABI breakage, and more ABI breakages if mistakes are made during the process.
If you're also concerned about tracking ABI/API, another thing you might consider is asking the author of https://abi-laboratory.pro to track libxdo.
and yes: i agree that doing something to get to a stable state will require at least one more formal ABI change -- an SONAME bump is probably worth doing to get there. and that's ok, i expect that anyone who wants a stable ABI (including me) will be fine with handling an explicit ABI breakage for the sake of minimizing risk of future accidental ABI breakage.
great to see diligent work being done 😁👍
Not so great that I can't use the latest version of xdotool 😆 not even if I decide that the broken ABI is fine in my docker container. It's not available on some opt-in repository somewhere, so I would need to build it from source, am I seeing that right?
For anyone in the same situation: I ended up doing exactly that because I just wanted to use the getwindowclassname
command to check something:
sudo apt-get install -y libx11-dev libxtst-dev libxi-dev libxkbcommon-dev libxinerama-dev
git clone https://github.com/jordansissel/xdotool.git /tmp/xdotool
( cd /tmp/xdotool && make && sudo make install )
rm -rf /tmp/xdotool
xdotool [...]
way back in the day we had a discussion about ABI changes due to a member removal in
struct xdo
. I didn't follow up on it with a proposal to fix it concretely (i.e. by bumping the SONAME), just patched around it in debian to keep the API stable (see also https://bugs.debian.org/764076).I'm now (belatedly) looking at packaging v3.20211022.1 and it looks like there's another ABI change without an SONAME bump -- this due to member addition in
struct xdo_search
. Fromgit diff v3.20160805.1..v3.20211022.1 -- xdo.h
i see:(this corresponds with the additional definition of
SEARCH_ROLE
as the flag to set here)Consider a program built against the old version of
xdo.h
, which deals with anxdo_search_t
objectfoo
and setsfoo.pid = 12345;
before passing a pointer tofoo
intoxdo_search_windows
. The compiler calculates the memory offset forfoo.pid
based on the oldxdo.h
. When the program is dynamically linked against a newer version oflibxdo.3
, that newer library will read the12345
value as though it is achar*
, stored inwinrole
. All the other later members of the struct will be offset as well.This might on its own be fixable by moving the
winrole
member to the end of the struct, and relying on the fact that the user wouldn't have known to setSEARCH_ROLE
infoo.searchmask
. However, even this would still be unsafe, for two reasons:xdo_search_t
-- typically either on the stack or from the heap. The size allocated when building against an oldxdo.h
will be smaller than the size expected a newer library.winrole
member ofxdo_search_t
whenxdo_search_windows
is called. In particular,xdo_search_windows
callscheck_window_match
pointing to that struct. Andcheck_window_match
itself callscompile_re(search->winrole, …)
. This could result in sending unallocated memory (or arbitrary data) as input tocompile_re
. yikes! :/This isn't even getting into the issues around backward API/ABI compatibility -- where someone runs code that was built with a new version of
xdo.h
against an older version of the library.So the fix isn't particularly simple. The simplest thing to do from the library's perspective would probably be to bump the SONAME from 3 to 4. (i think this means releasing a new version 4.xx.1, if i understand the xdotool version numbering scheme)
If you do that, i'll manage the SONAME transition like any other SONAME bump on debian.
If you want to keep SONAME 3, one approach would be to move
winrole
to the end of the struct, and then be very careful to never access it unlessSEARCH_ROLE
is explcitly set. Note that this will be another ABI break to the recent versions oflibxdo
that were already released with the addition ofwinrole
.And, this won't handle the backward-compatibility case very well -- a program built against a newer
xdo.h
that usesxdo_search_windows
while settingSEARCH_ROLE
insearchmask
and populatingwinrole
will have that parameter silently ignored by an olderlibxdo.so.3
(i don't think any version ofxdo_search_windows
has ever checked for unknown flags insearchmask
and rejected the query with any sort of "flag unsupported" error if found).