lmrodriguezr / nonpareil

Estimate metagenomic coverage and sequence diversity
http://enve-omics.ce.gatech.edu/nonpareil/

Suggestion on how to debug 'buffer overflow detected' error? #68

Closed jfy133 closed 1 day ago

jfy133 commented 1 week ago

I was double checking the bioconda version of Nonpareil 3.5.3, but unfortunately everything I try results in the following error:

$ nonpareil -s test.fastq -T kmer -f fastq -R 10 -v 10
Nonpareil v3.5.3
 [      0.0]  Reading test.fastq
 [      0.0]   Picking 10000 random sequences
 [      0.0]   Counting kmers
 [      0.0]  Read file with 632060 sequences
 [      0.0]  Average read length is 151.000000bp
 [      0.0]          Worker 0 @start_samples.
 [      0.0]  Sub-sampling library
*** buffer overflow detected ***: terminated                        
Aborted (core dumped)

The same command works correctly if I use the 3.5.3 binary from here (running on Linux).

I'm not sure how to debug this error; increasing the verbosity doesn't provide much more information.

I'm wondering if we are somehow missing a dependency (for example one related to multithreading, given the "Worker 0" message)?

Do you see anything missing on the conda recipe dependency list?

https://github.com/bioconda/bioconda-recipes/blob/be4b35dbdcda3e55c0960bec2097cbf8e44322ac/recipes/nonpareil/meta.yaml#L12-L28

Or have any further suggestions on how to debug?

fgvieira commented 1 week ago

Same here!

lmrodriguezr commented 1 day ago

Dear both,

I can reproduce the issue with the conda package, but I have not been able to set up a compilation environment that causes the same problem, even when my locally built binary links against the exact same libraries:

(nonpareil-test) [c7181116@login.leo5 nonpareil]$ ldd $(which nonpareil)
    linux-vdso.so.1 (0x00007ffea6bf1000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00001482a42ed000)
    libz.so.1 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libz.so.1 (0x00001482a46ed000)
    libstdc++.so.6 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libstdc++.so.6 (0x00001482a410a000)
    libm.so.6 => /lib64/libm.so.6 (0x00001482a3d88000)
    libgcc_s.so.1 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libgcc_s.so.1 (0x00001482a46ce000)
    libc.so.6 => /lib64/libc.so.6 (0x00001482a39c3000)
    /lib64/ld-linux-x86-64.so.2 (0x00001482a450d000)
    librt.so.1 => /lib64/librt.so.1 (0x00001482a37bb000)
(nonpareil-test) [c7181116@login.leo5 nonpareil]$ nonpareil -s test/test.fastq.gz -T kmer -f fastq -R 10 -v 10 -d 0 -b o -t 10
Nonpareil v3.5.3
 [      0.0]   The file test/test.fastq.gz.enve-tmp.162392 was created
 [      0.0]    Reducing query reads (-X) to 375
 [      0.0]  Reading test/test.fastq.gz.enve-tmp.162392
 [      0.0]   Picking 375 random sequences
 [      0.0]   Counting kmers
 [      0.0]  Read file with 500 sequences
 [      0.0]  Average read length is 88.496000bp
 [      0.0]          Worker 0 @start_samples.
 [      0.0]  Sub-sampling library
*** buffer overflow detected ***: nonpareil terminated              
Aborted (core dumped)
(nonpareil-test) [c7181116@login.leo5 nonpareil]$ LD_LIBRARY_PATH=/home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/ ldd ./nonpareil
    linux-vdso.so.1 (0x00007ffc037c9000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x000014a9c1ec5000)
    libz.so.1 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libz.so.1 (0x000014a9c22e4000)
    libstdc++.so.6 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libstdc++.so.6 (0x000014a9c1ce2000)
    libm.so.6 => /lib64/libm.so.6 (0x000014a9c1960000)
    libgcc_s.so.1 => /home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/libgcc_s.so.1 (0x000014a9c22c5000)
    libc.so.6 => /lib64/libc.so.6 (0x000014a9c159b000)
    /lib64/ld-linux-x86-64.so.2 (0x000014a9c20e5000)
    librt.so.1 => /lib64/librt.so.1 (0x000014a9c1393000)
(nonpareil-test) [c7181116@login.leo5 nonpareil]$ LD_LIBRARY_PATH=/home/c718/c7181116/.conda/envs/nonpareil-test/bin/../lib/ ./nonpareil -s test/test.fastq.gz -T kmer -f fastq -R 10 -v 10
Nonpareil v3.5.3
 [      0.0]   The file test/test.fastq.gz.enve-tmp.163214 was created
 [      0.0]    Reducing query reads (-X) to 375
 [      0.0]  Reading test/test.fastq.gz.enve-tmp.163214
 [      0.0]   Picking 375 random sequences
 [      0.0]   Counting kmers
 [      0.0]  Read file with 500 sequences
 [      0.0]  Average read length is 88.496000bp
 [      0.0]          Worker 0 @start_samples.
 [      0.0]  Sub-sampling library
 [      0.0]          Worker 0 @start_checkings.                      
 [      0.0]  Evaluating consistency                              
 [      0.0]  Everything seems correct
 [      0.0]          Worker 0 @exit.

Any ideas on how I could access the compilation environment used by conda? I'm not well-versed in the conda universe, but presumably there is a way of reproducing the environment locally, right?

In the meantime, I'm adding a couple of debug messages (with -v 9) to try to narrow down where the issue originates. Do we need a release for it or can we use the git HEAD?

Thank you!!! Miguel.

jfy133 commented 1 day ago

My rough notes on how to get into the bioconda build environment are here:

https://hackmd.io/@jfy133/ryXNpa9Op

TL;DR: install [bioconda-utils](https://bioconda.github.io/contributor/building-locally.html#id2) with conda, run `bioconda-utils build --docker --mulled-test --packages`, then follow the instructions at the bottom of that link, in the section called 'Debugging bioconda-utils'.

In this case you will want to fork the bioconda repo and, on your fork, modify the URL to take HEAD (or a release, or a commit, anything that generates a tarball), then update the SHA256 hash and the build number.

Then run the bioconda-utils command above to test the recipe. Run it once so that you have access to the build directory, then follow the instructions in my notes above.

I have a summer school to teach in less than a month so I have less time for this at the moment, but happy to reply here with suggestions on getting into the bioconda build env etc.

lmrodriguezr commented 1 day ago

Should be solved in https://github.com/bioconda/bioconda-recipes/pull/49008

jfy133 commented 1 day ago

Woot! Thank you @lmrodriguezr ! How did you solve the issue?

lmrodriguezr commented 1 day ago

It was a wrong buffer length variable. In practice, this should have never caused problems, since it would only overflow with numbers longer than 126 digits in decimal. However, it might be that the implementation of snprintf used in the conda environment asserts memory availability(?)

In any case, the actual fix was simple: I replaced LARGEST_LINE with LARGEST_LABEL. Finding it was trickier, but I just littered the code with debug messages and then got rid of them 😅
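
For readers hitting the same message, here is a minimal C++ sketch of this kind of size mismatch. It is not Nonpareil's actual code: the two macro names come from the comment above, but their values (128 and 2048), the helper names `format_label` and `claimed_size_fits`, and the `%lu` format are assumptions made for illustration. One plausible explanation for the abort, not confirmed in this thread, is that the conda build enables glibc's `_FORTIFY_SOURCE` checks. Under those checks `snprintf` aborts with "buffer overflow detected" whenever the size argument is larger than the destination buffer the compiler can see, even if the formatted text would have fit.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical values: the thread names these two macros but not their sizes.
constexpr std::size_t LARGEST_LABEL = 128;
constexpr std::size_t LARGEST_LINE = 2048;

// The mismatch only matters when the claimed size is the larger of the two.
static_assert(LARGEST_LINE > LARGEST_LABEL, "sketch assumes line > label");

// Models the argument check a fortified snprintf performs before writing:
// the size the caller claims must not exceed the real destination size.
bool claimed_size_fits(std::size_t destination_size, std::size_t claimed_size) {
    return claimed_size <= destination_size;
}

// Formats a counter into a label-sized buffer, passing the buffer's real size.
std::string format_label(unsigned long value) {
    char label[LARGEST_LABEL];
    // Buggy variant (assumed shape): snprintf(label, LARGEST_LINE, "%lu", value);
    // The digits would fit, but the claimed size exceeds sizeof(label).
    std::snprintf(label, sizeof(label), "%lu", value);
    return std::string(label);
}
```

Passing `sizeof(label)` rather than a separately maintained constant keeps the claimed size and the real buffer size in sync, which avoids this class of error regardless of whether the toolchain enables fortification.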

jfy133 commented 1 day ago

Awesome, thank you! The bioconda PR is merged in. I will test in 5 minutes, to give the repo cache a chance to update, and report back :)

jfy133 commented 1 day ago

I can confirm it now works for me on bioconda, @lmrodriguezr! Thank you very much :D

My very last request is the one here 🙏: https://github.com/lmrodriguezr/nonpareil/issues/63