Memory corruption using bioconda::samtools + miniforge

jmunoz94 commented 7 months ago

Hi!

We use samtools tview to generate some output we then share via an internal API endpoint. Historically, we used samtools 1.9 + miniconda and everything worked as expected.

However, we had to move from miniconda to miniforge because of the update to miniconda's terms and conditions. Making the switch seemed to be relatively straightforward, till we noticed the endpoint was erroring for all requests. Right now I do not recall the exact error we were getting with samtools 1.9, but updating the library appeared to fix it. We moved to samtools 1.19 (which is the latest version as of today).

The endpoint works for most requests I have tried, but a unit test has been failing consistently. This unit test runs the command below:

samtools tview -d T -p chr1:99999999999999999999 some.bam hg19.fa

Which fails with

malloc(): memory corruption
Aborted

I set the verbosity level to 8 (--verbosity=8), but no extra information is given:

[I::hts_idx_check_local] Using alignment file 'some.bam'
malloc(): memory corruption
Aborted

I also noticed that the memory allocation error is not raised if I remove one 9 from the command above. So I wonder if it is either running out of memory or if the bam has issues around that position.

This issue does not happen at all (not even with all the 9s/bigger numbers) if I run samtools installed from brew.

I see that the output of samtools --version is different, so I was wondering if some flag set when installing samtools from bioconda using miniforge is to blame.

Bioconda + Miniforge:

samtools 1.19.2
Using htslib 1.19.1
Copyright (C) 2024 Genome Research Ltd.

Samtools compilation details:
    Features:       build=configure curses=yes 
    CC:             /opt/conda/conda-bld/samtools_1709670755015/_build_env/bin/x86_64-conda-linux-gnu-cc
    CPPFLAGS:       -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /conda/include
    CFLAGS:         -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /conda/include -fdebug-prefix-map=/opt/conda/conda-bld/samtools_1709670755015/work=/usr/local/src/conda/samtools-1.19.2 -fdebug-prefix-map=/conda=/usr/local/src/conda-prefix
    LDFLAGS:        -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/conda/lib -Wl,-rpath-link,/conda/lib -L/conda/lib
    HTSDIR:         
    LIBS:           
    CURSES_LIB:     -ltinfow -lncursesw

HTSlib compilation details:
    Features:       build=configure libcurl=yes S3=yes GCS=yes libdeflate=yes lzma=yes bzip2=yes plugins=yes plugin-path=/conda/libexec/htslib htscodecs=1.6.0
    CC:             /opt/conda/conda-bld/htslib_1709638118038/_build_env/bin/x86_64-conda-linux-gnu-cc
    CPPFLAGS:       -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /conda/include
    CFLAGS:         -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /conda/include -fdebug-prefix-map=/opt/conda/conda-bld/htslib_1709638118038/work=/usr/local/src/conda/htslib-1.19.1 -fdebug-prefix-map=/conda=/usr/local/src/conda-prefix -fvisibility=hidden
    LDFLAGS:        -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/conda/lib -Wl,-rpath-link,/conda/lib -L/conda/lib -fvisibility=hidden -rdynamic

HTSlib URL scheme handlers present:
    built-in:    preload, data, file
    Amazon S3:   s3+https, s3+http, s3
    libcurl:     imaps, pop3, gophers, http, smb, gopher, sftp, ftps, imap, smtp, smtps, rtsp, scp, ftp, telnet, mqtt, https, smbs, tftp, pop3s, dict
    S3 Multipart Upload:     s3w, s3w+https, s3w+http
    Google Cloud Storage:    gs+http, gs+https, gs
    crypt4gh-needed:     crypt4gh
    mem:     mem

Brew:

samtools 1.19.2
Using htslib 1.19.1
Copyright (C) 2024 Genome Research Ltd.

Samtools compilation details:
    Features:       build=configure curses=yes 
    CC:             clang
    CPPFLAGS:       
    CFLAGS:         -Wall -g -O2
    LDFLAGS:        
    HTSDIR:         
    LIBS:           
    CURSES_LIB:     -lncurses

HTSlib compilation details:
    Features:       build=configure libcurl=yes S3=yes GCS=yes libdeflate=no lzma=yes bzip2=yes plugins=no htscodecs=1.6.0
    CC:             clang
    CPPFLAGS:       
    CFLAGS:         -Wall -g -O2 -fvisibility=hidden
    LDFLAGS:        -fvisibility=hidden 

HTSlib URL scheme handlers present:
    built-in:    preload, data, file
    S3 Multipart Upload:     s3w, s3w+https, s3w+http
    Amazon S3:   s3+https, s3+http, s3
    Google Cloud Storage:    gs+http, gs+https, gs
    libcurl:     imaps, pop3, gophers, http, smb, gopher, ftps, imap, smtp, smtps, rtsp, ftp, telnet, mqtt, ldap, https, ldaps, smbs, tftp, pop3s, dict
    crypt4gh-needed:     crypt4gh
    mem:     mem

The problematic samtools installation is on an Ubuntu container:

uname -m returns "x86_64"

gcc --version returns:

gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The container has not changed other than switching from miniconda to miniforge (using Miniforge3-4.10.0-0-Linux-x86_64, newer versions caused dependency issues for other libraries).

Is this a known issue with a known fix? I tried older versions of samtools and they all cause the same error, 1.9 does not, but it caused other issues (the ones I said I do not recall exactly, but it was breaking everything, even worse than 1.19).
Are there any other commands I could run to get a better idea of what the error is?
Should I install samtools differently? How?

I am opening an issue because I could not get much in terms of error messages and I could not find anything helpful googling. I am not very familiar with ~scientific stuff (including but not limited to bam files), but the bam in question opens in IGV (2.16.0) without issues (so I assume the file is OK?). I am not sharing the file because it is private, hopefully the other information is enough to determine what the issue is.

Thank you!

martin-g commented 7 months ago

Hi! You could try to build samtools yourself from sources and see whether it works or not. Bioconda downloads it from https://github.com/bioconda/bioconda-recipes/blob/76794693913ab648d37f7c4d2ddc14413dd9d206/recipes/samtools/meta.yaml#L13 And then builds it with these commands: https://github.com/bioconda/bioconda-recipes/blob/76794693913ab648d37f7c4d2ddc14413dd9d206/recipes/samtools/build.sh

jmarshall commented 6 months ago

99999999999999999999 is larger than 2⁶⁴. It shouldn't really crash (I haven't reproduced that yet), but it's also not something you should expect to work and is not very useful as a test case. Do you really have a chromosome that long? I suggest you reduce the position in your unit test to something rather more realistic.

Positions internally in samtools are signed 64-bit integers. I would not assume that positions beyond 9223372034707292159 work correctly.

jmunoz94 commented 6 months ago

Hi, thank you for your suggestions and feedback.

I tried building with the commands in the samtools recipe, but it first complains about htslib (it detects multiple installations, but "system" does not work). After explicitly setting the htslib path, it then starts complaining about other dependencies. I finally decided to try with apt, and I was positively surprised to see that it has version 1.17 (which is not the newest, but much more recent than the one we were using).

Regarding the test, it was added in 2018 and the person who added it is not around anymore. I assume she wanted to have a very large number there. Keeping it big but smaller than 2^64 makes sense (I will make the update), but it is still really weird that it only fails with this bioconda::samtools + miniforge combination.

I also noticed that another command yields different results:

samtools mpileup --fasta-ref hg19.fa --region chr1 --positions in_file --output out_file some.bam

bioconda::samtools	apt

in_file is a file with the following single line:

chr1    209806391       209806501

At first I thought this was because of max-depth*, but after explicitly setting it to 8000 results are still different. First and last lines in the output file are different, in the screenshots above I show how the first lines (1 to 15) differ.

I am not familiar with these formats, neither with the science behind them, and I suppose that not being able to provide the bam I am using does not help, but any ideas of what could cause that difference?

I also found the error that we were getting with samtools 1.9 (the older version we used before moving from miniconda to miniforge) was:

samtools tview: cannot read index for "some_file.bam"

Footnotes:

*I thought so because when running the command with the version that apt installs, it prints:

[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000

While bioconda::samtools + miniforge only prints the first line.

jmarshall commented 6 months ago

The malloc(): memory corruption crash turns out to be a memory corruption buglet in tview when given enormous positions, introduced in samtools 1.10. As a memory corruption bug, it presents differently on different platforms and with different compilation options, which is why you didn't see the exact same behaviour from your brew build. This will be fixed by samtools/samtools#2032.

The workaround for your test case is the same as I suggested above. Samtools does not read in these 99999…99999 values correctly, as it only represents values up to 2⁶³-1 and does not currently reject extreme values beyond that. So I would recommend you adjust your test case to use a position within that range — moreover with 13 or fewer digits to avoid the crash you've discovered with existing samtools versions.

Me, I don't have any particular advice about your differing mpileup output, other than that such things are often due to BAQ recalculation and/or overlapping read handling. So you might try experimenting with the -B and -x options.

bioconda / bioconda-recipes

Memory corruption using bioconda::samtools + miniforge #47137