Closed daviesrob closed 1 year ago
"HTSLIB_1.17" : Using an updated
htslib.map
with the new symbols in anHTSLIB_1.17
section. This is what the next release should look like. Built against ↓ / runtime linked to → unversioned HTSLIB_1.16 HTSLIB_1.17 unversioned works works works HTSLIB_1.16 works works works HTSLIB_1.17 works fails worksSo
samtools
build against an unversioned library will work with anything, and an unversioned library will work withsamtools
even if it was built against a versioned one.samtools
built against theHTSLIB_1.16
library will still work even if the library is upgraded toHTSLIB_1.17
. Butsamtools
expectingHTSLIB_1.17
will complain about "version 'HTSLIB_1.17' not found" and refuse to run if given theHTSLIB_1.16
library (even though the symbols needed are present in unversioned form).
I'm genuinely curious on this one. What's actually going on here?
If current samtools can build, link and uses htslib 1.16, then it clearly doesn't need any symbols from 1.17. So why if we build it against 1.17 and then replace the shared library with a 1.16 one does it die?
Is it due to some inline functions used by samtools that have changed, that now refer to htslib 1.17 symbols only?
Naively what I'd expect is samtools when built and linked against a newer versions would be specifying all the symbols it uses and their version numbers, and as long as those versions are present in whatever library we use at run time then it should work.
That's because ld.so
doesn't resolve all symbols at start-up (unless you set LD_BIND_NOW
). So when versioned symbols are present, it doesn't bother to check all the functions to see if they're around. Instead it does a quick check to ensure that all the version names it expects are present, and fails if any of them are missing.
For clarity incase others didn't read the above very carefully, when the table states samtools "built against HTSLIB_1.16", it doesn't mean built against the released htslib 1.16 tarball or package install. Instead it means samtools develop built against htslib develop (pre-1.17) with a 1.16 symbol version file.
Hence my confusion as to why the versioning wasn't apparently helping with binary compatibility.
This does not seem to work with bcftools. Is that expected behaviour?
It should work. What went wrong?
I tried a 117 version of bcftools and it worked quite happily on the 116 version of htslib instead of throwing an error.
I think that's expected to happen, as the 1.16 version didn't have any symbol versioning.
To verify this is working as intended what we ought to do is create a local version of 1.15, 1.16, 1.17 built with their appropriate symbol table files. This gives us something to experiment with to verify how things will work in the future. I'm not proposing we go back and rerelease things, but just to use this as a test bed to verify it does what we want it to.
Edit: although I wouldn't be suprised if someone like Debian did take the symbol version file and retrofit it to older LTS releases, as once the leg work has been done it's quite easy for them to add it as a patch.
I was trying to reproduce the results in the table of the original comment but with bcftools rather than samtools.
I think bcftools
develop works because it hasn't started to use any of the symbols introduced since 1.16 yet. If I try it built against my HTSLIB_1.17 library, I find it referenced symbols from these versions:
nm -D -g --with-symbol-versions bcftools | grep HTSLIB | sed 's/.*@//' | sort | uniq -c
139 HTSLIB_1.0
3 HTSLIB_1.1
2 HTSLIB_1.10
1 HTSLIB_1.13
3 HTSLIB_1.16
4 HTSLIB_1.2.1
5 HTSLIB_1.3
21 HTSLIB_1.4
1 HTSLIB_1.5
3 HTSLIB_1.6
1 HTSLIB_1.7
So no HTSLIB_1.17 in there. This isn't too surprising as most of the new symbols in HTSlib and bam and cram-related. The ones bcftools might be most interested in are bcf_strerror()
, and the faidx interfaces fai_adjust_region()
, faidx_seq_len64()
, and fai_line_length()
. None of them have made it into its code-base yet.
Thanks, that would explain it.
You used -Wl,-version-script
, but this may need to be -Wl,--version-script
—that's how GNU ld documents it.
Hm, so it does. Although -version-script
evidently works.
The single-hyphen ‑version-script
also appears in my hacky draft over on the original issue.
I don't recall whether this was a typo or transcribed from one of the other projects I was looking at or influenced by ‑shared
and ‑soname
on the same link command line (which are documented as being spelled with one hyphen, presumably for compatibility with older vendor linkers).
The same GNU ld documentation page also says
For options whose names are multiple letters, either one dash or two can precede the option name; for example, ‘-trace-symbol’ and ‘--trace-symbol’ are equivalent. [etc]
Looking at the source code (see ONE_DASH
/TWO_DASHES
/EXACTLY_TWO_DASHES
in ld/lexsup.c) this has been like this for many years.
Likewise the gold
source code documentation says it accepts either.
Are there any other linkers that accept a similar version script option? If so, they might provide a ruling. Otherwise… ¯\_(ツ)_/¯
The first commit with the
htslib.map
file is entirely based on the content in this comment from @jmarshall, so I've taken the liberty of making him the author of it.I've also added
configure
support, so people who dislike versioned symbols can turn them off using./configure --disable-versioned-symbols
. The option name was copied from the same feature incurl
's configure script.There's also a rudimentary Makefile rule to update the
htslib.map
file with any new exported symbols. It includes checks to ensure that HTSlib's version number has been updated before adding anything. If you want to force it to do something for testing, it's possible by overriding thePACKAGE_VERSION
variable, e.g.:Note that the
htslib.map
file should only be updated on package release, otherwise you risk breaking compatibility for programs that link against the library when the latter gets updated.To see how symbol versions affect linking, it's possible to make multiple copies of the shared library with different map files installed in separate directories. Then you can try building e.g.
samtools
against each version usingconfigure --with-htslib=/tmp/htslib_1.16
etc., and use LD_LIBRARY_PATH to force it to use the different versions of the shared library to see what happens. I've summarised the results for the current samtools and HTSlibdevelop
branches below. The three versions I used were:"unversioned" : Symbol versioning turned off using
./configure --disable-versioned-symbols
"HTSLIB_1.16" : The
htslib.map
file in this PR. New symbols added since 1.16 are unversioned."HTSLIB_1.17" : Using an updated
htslib.map
with the new symbols in anHTSLIB_1.17
section. This is what the next release should look like.So
samtools
build against an unversioned library will work with anything, and an unversioned library will work withsamtools
even if it was built against a versioned one.samtools
built against theHTSLIB_1.16
library will still work even if the library is upgraded toHTSLIB_1.17
. Butsamtools
expectingHTSLIB_1.17
will complain about "version 'HTSLIB_1.17' not found" and refuse to run if given theHTSLIB_1.16
library (even though the symbols needed are present in unversioned form).Fixes #1505