GATB / bcalm

compacted de Bruijn graph construction in low memory
MIT License

Running bcalm raises error: `cannot create a union-find data structure, too many elements.` #53

Open halexand opened 4 years ago

halexand commented 4 years ago

Hello,

I am running bcalm as part of the spacegraphcats pipeline in collaboration with @taylorreiter. The command I am running is: `bcalm -in INFILES -out OUTFILE -kmer-size 31 -abundance-min 1`

I got an interesting error today that ended with:

UF MPHF constructed (4803 MB)                   01:52:39     memory [current, maxRSS]: [19184, 88111] MB 
cannot create a union-find data structure, too many elements. This should in fact not even happen. Please contact a BCALM developer

So, I am contacting you.

You can see the full error here: bcalm.MS-all-SRF-5-20.00.k31.unitigs.fa.log.txt

Any insight into what this error might indicate?

Thank you!

rchikhi commented 4 years ago

Hi, thanks for the detailed report.

1) Try using `-abundance-min 2` to make that instance a bit smaller, unless you really have a good reason to care about kmers seen only once, or
2) Wait for a short while. I've seen that bug on my end too. It happened on a huge instance with more than 70 billion distinct kmers. I'll work on fixing it.

Best, Rayan
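To illustrate why raising the abundance threshold shrinks the instance, here is a minimal Python sketch on toy data. It is not BCALM's implementation: the function names and reads are hypothetical, and it assumes k-mers are counted in canonical form (the lexicographically smaller of a k-mer and its reverse complement). A single sequencing error in one read introduces several k-mers that are seen only once, and a threshold of 2 discards them.

```python
from collections import Counter

# Hypothetical helper names; a toy illustration only, not BCALM code.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical_kmers(read, k):
    """Yield each k-mer of the read in canonical form."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        yield min(kmer, reverse_complement(kmer))


def count_distinct(reads, k, abundance_min):
    """Count distinct canonical k-mers seen at least abundance_min times."""
    counts = Counter()
    for read in reads:
        counts.update(canonical_kmers(read, k))
    return sum(1 for c in counts.values() if c >= abundance_min)


# Two identical reads plus one read with a single substitution,
# standing in for a sequencing error.
reads = [
    "ACGTACGTGA",
    "ACGTACGTGA",
    "ACGTACCTGA",
]

k = 5
print("distinct k-mers, threshold 1:", count_distinct(reads, k, 1))
print("distinct k-mers, threshold 2:", count_distinct(reads, k, 2))
```

On real data the size of the reduction depends on coverage and error rate, so the threshold trades instance size against the loss of genuinely rare k-mers.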

halexand commented 4 years ago

Thanks so much! We are trying to deal with some pretty complex samples... so I think that is likely the issue.

We are trying a new run with `-abundance-min 2` for now to make the instance smaller, but we would be curious to try out any fix you might have in the future.

Best,

Harriet

PandorasRed commented 2 years ago

Hello, I have the same problem. Is a fix still in the works?

sebschmi commented 1 year ago

Same problem here :(

rchikhi commented 1 year ago

Hi, no fix is in the works; this seems to be a fundamental limitation of BCALM2 so far. I'd recommend you try the excellent Cuttlefish2 software: https://github.com/COMBINE-lab/cuttlefish

sebschmi commented 1 year ago

Great, thanks!