lmrodriguezr / nonpareil

Estimate metagenomic coverage and sequence diversity
http://enve-omics.ce.gatech.edu/nonpareil/

"std::bad_alloc" and "Cannot open the file" errors #64

Closed hkaspersen closed 3 months ago

hkaspersen commented 8 months ago

I am trying to run the following command on an uncompressed FASTQ file (the only FASTQ file in the directory):

nonpareil -s *fastq -T kmer -f fastq -X 10000 -n 2048 -k 24 -t 16

With Nonpareil version 3.4.1, this command results in the following error after running for approximately 20-30 minutes:

Nonpareil v3.401
 [      0.0]  reading 100-B5-3_S15_L002_R1_001.fastq
 [      0.0]  Picking 10000 random sequences
 [      0.0]  Started counting
Fatal error:
Cannot open the file
 [     18.5] Fatal error: Cannot open the file

To test, I switched to version 3.3.3, using the exact same data and command as above, but then I got the following error:

Nonpareil v3.303
 [      0.0]  reading 100-B5-3_S15_L002_R1_001.fastq
 [      0.0]  Picking 10000 random sequences
 [      0.1]  Started counting
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

In both cases, I had allocated 40 GB of memory and 16 CPUs to the process. The file itself is not corrupt or otherwise compromised. I am running Nonpareil through a conda installation (a separate env for each version). I cannot figure out what the problem is. Any help would be appreciated!
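
For anyone reproducing this, a small pre-flight check can rule out input problems before the long run starts. The sketch below is not part of Nonpareil; `check_fastq_input` is a hypothetical helper name. It assumes bash and a FASTQ file with exactly four lines per record and a trailing newline. It only confirms that the `*fastq` glob matches exactly one readable file with a plausible line count. It does not diagnose the cause of either error above, which the thread does not establish.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not shipped with Nonpareil.
# Usage: check_fastq_input [DIR]   (DIR defaults to the current directory)
check_fastq_input() {
  local dir="${1:-.}"
  local files=()
  local f lines

  # Collect regular files whose names end in "fastq", mirroring the *fastq glob.
  for f in "$dir"/*fastq; do
    [ -f "$f" ] && files+=("$f")
  done

  # The command in this issue relies on the glob expanding to a single file.
  if [ "${#files[@]}" -ne 1 ]; then
    echo "expected exactly 1 FASTQ file, found ${#files[@]}" >&2
    return 1
  fi

  f="${files[0]}"
  if [ ! -r "$f" ]; then
    echo "cannot read $f" >&2
    return 1
  fi

  # Assumption: 4 lines per record. Multi-line FASTQ or a missing
  # trailing newline will fail this check even if the file is valid.
  lines=$(( $(wc -l < "$f") ))
  if [ "$lines" -eq 0 ] || [ $((lines % 4)) -ne 0 ]; then
    echo "$f: $lines lines is not a positive multiple of 4" >&2
    return 1
  fi

  echo "$f: $((lines / 4)) records"
}
```

One way to use it is to gate the original command on the check, for example `check_fastq_input . && nonpareil -s *fastq -T kmer -f fastq -X 10000 -n 2048 -k 24 -t 16`.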

lmrodriguezr commented 3 months ago

Dear @hkaspersen, please take a look at the new version of Nonpareil and let us know if this remains an issue. I'm closing it for now, but please feel free to reopen.