Bioconductor / Rhtslib

HTSlib high-throughput sequencing library as an R package
https://bioconductor.org/packages/Rhtslib
11 stars 12 forks source link

`R CMD check` warning on the `__sprintf_chk` symbol #31

Closed oleksii-nikolaienko closed 11 months ago

oleksii-nikolaienko commented 1 year ago

Hi, I asked related question on Bioc-devel mailing list, but now did some checks and decided to post this issue. Current Rhtslib's check results at nebbiolo contain the following warning (and similar warnings therefore appear in all packages linking to Rhtslib):

File ‘/home/biocbuild/bbs-3.17-bioc/R/library/Rhtslib/libs/Rhtslib.so’:
  Found ‘__sprintf_chk’, possibly from ‘sprintf’ (C)
  Found ‘abort’, possibly from ‘abort’ (C)
  Found ‘exit’, possibly from ‘exit’ (C)
  Found ‘stderr’, possibly from ‘stderr’ (C)
  Found ‘stdout’, possibly from ‘stdout’ (C)

__sprintf_chk is indeed present in Rhtslib.so:

> system(paste("nm", system.file("libs/Rhtslib.so", package="Rhtslib"), "| grep printf"))
                 U __fprintf_chk@@GLIBC_2.3.4
00000000000779b0 T ksprintf
0000000000077790 T kvsprintf
                 U __snprintf_chk@@GLIBC_2.3.4
                 U snprintf@@GLIBC_2.2.5
                 U __sprintf_chk@@GLIBC_2.3.4
                 U __vfprintf_chk@@GLIBC_2.3.4
                 U __vsnprintf_chk@@GLIBC_2.3.4

However, none of files from include/htslib use sprintf (same is also true for abort). It is only used in some C files from include: bam_cat.c, bam_sort.c, etc — which seem to be a part of samtools but not htslib.

Is the reason for __sprintf_chk warning that non-htslib files located in the include folder use sprintf? Are those files used by anyone? Should those sprintfs be substituted by snprintfs?

Thanks for looking into this.

Oleksii

oleksii-nikolaienko commented 1 year ago

No, sorry, my mistake, it actually looks like there are lots of sprintfs in htslib code. I guess, even if someone substitutes all sprintfs with snprintfs, it wouldn't solve R CMD check warnings because there are multiple aborts and other prohibited functions?..

oleksii-nikolaienko commented 1 year ago

Hi @hpages , Please see the following discussion. Guys from HTSlib were very nice to substitute sprintfs with snprintfs so we could get rid of R CMD check warnings. Would it be possible to update Rhtslib to the most recent version? [Suggested solution for sprintfs in samtools parts of Rhtslib is that we fix them ourselves] Thanks!

hpages commented 1 year ago

Awesome. thanks for doing that. Interesting discussion BTW!

Would it be possible to update Rhtslib to the most recent version?

Possible yes, but not a small endeavor. I've done this kind of update twice in the past already, and each time there were significant changes in the HTSlib API that broke several Bioconductor packages that depend on Rhtslib, so a lot of patching was needed. Last update was from HTSlib 1.7 to 1.15.1 one year ago.

Current HTSlib is 1.17. I don't think it includes the sprintf -> snprintf fix though. We'll probably need to wait for the next HTSlib version for that. It's on my list to update Rhtslib again at some point but I don't have a clear time line for this.

oleksii-nikolaienko commented 1 year ago

Current HTSlib is 1.17. I don't think it includes the sprintf -> snprintf fix though.

Yes, you're right, I guess it will be in the next release.

May involve a lot of patching, true, but hopefully less than before since fewer releases were skipped...

hpages commented 1 year ago

May involve a lot of patching, true, but hopefully less than before since fewer releases were skipped...

That's my hope too!

hpages commented 1 year ago

I just updated Rhtslib to the most recent version of HTSlib (1.18), see commit 7f7b76523860aa0191685afaf7ef0226416b5f59.

This is in BioC 3.19 only (current devel).

Running R CMD check locally on my laptop (Ubuntu Linux) on all the reverse deps didn't indicate any major issue, except for Rsamtools for which I committed a small fix. Anyways, I'll closely monitor the build/check results for Rhtslib and its reverse deps in the following days at https://bioconductor.org/checkResults/3.19/bioc-LATEST/Rhtslib/

Hopefully the update won't cause too much trouble and the annoying R CMD check warning about __sprintf_chk/sprintf, abort, etc... will go away.

oleksii-nikolaienko commented 1 year ago

Hervé, this is amazing, thank you! I actually wanted to politely remind you about this after the 3.18 release, but you were first :) Unfortunately, something else I asked for didn't make it to htslib-1.18 release: the ability to read FASTA with decompression threads. I wonder if it might be of interest to other Bioc packages too...

Edit: oh, it actually did appear in 1.18!

hpages commented 12 months ago

Many "Error during sorting" on today's report for Windows, unfortunately:

(Let's ignore the errors on Mac for now. Most of them are due to a configuration issue on merida1 that should clear out tomorrow.)

This will require an unpleasant troubleshooting session on Windows, I'm afraid :disappointed: I'll try to get to this tomorrow when I have access to our Windows builder palomino3.. sigh

oleksii-nikolaienko commented 12 months ago

Sorry to hear that. Strange that it's on Windows only...

hpages commented 12 months ago

Sorry to hear that. Strange that it's on Windows only...

That would not be the first time. See this old Windows-specific fix, or this more recent one (from 2019): https://github.com/Bioconductor/Rhtslib/blob/33bff969b86908627fb4677fa58e0c306157ddc8/src/htslib-1.18/kstring.c#L143-L200

This is pretty typical :disappointed:

Anyway, in this case, it looks like a self-inflicted issue. This old hack in Rsamtools finally backfired and what was meant to happen finally happened: bam_sort_core_ext argument list changed in htslib 1.18 so the bam_sort_core_ext prototype declared in Rsamtools/src/io_sam.c was no longer in sync with https://github.com/Bioconductor/Rhtslib/blob/33bff969b86908627fb4677fa58e0c306157ddc8/inst/include/bam_sort.c#L3202-L3207

This should now be fixed in Rhtslib 2.99.1 (commit 33bff969b86908627fb4677fa58e0c306157ddc8) and Rsamtools 2.19.1 (commit https://github.com/Bioconductor/Rsamtools/commit/732f0bbe9307999c50fd9c9a5944781e11ba140c).

Let's see how things go on tomorrow's build reports for Rsamtools, csaw, and raer.

hpages commented 11 months ago

All seems good on all platforms now.