bacpop / unitig-counter

Uses cDBG to count unitigs in bacterial populations
GNU Affero General Public License v3.0

unitig-counter memory error #9

Open anna7287 opened 4 years ago

anna7287 commented 4 years ago

Hi John,

I'm trying to run unitig-counter using the command below, but I get a memory-related error. Do you know how I could fix this? Thanks

unitig-counter -strains strain_list.txt -output output -nb-cores 4

Building DBG and mapping strains on the DBG...
[DSK: Pass 1/1, Step 2: counting kmers ] 47.7 % elapsed: 0 min 46 sec remaining: 0 min 51 sec cpu: 126.6 % mem: [ 79, 136, 179] MB
EXCEPTION: Pool allocation failed for 584120 bytes (kmers alloc), mainbuffer is null?. Current usage is 584136 and capacity is 5242881088
Pool allocation failed for 560168 bytes (kmers alloc), mainbuffer is null?. Current usage is 1144312 and capacity is 5242881088
Pool allocation failed for 572896 bytes (kmers alloc), mainbuffer is null?. Current usage is 1717216 and capacity is 5242881088
Pool allocation failed for 577416 bytes (kmers alloc), mainbuffer is null?. Current usage is 2294648 and capacity is 5242881088

johnlees commented 4 years ago

I've not seen that before. How many samples are you looking at, and do you definitely have enough memory available on the system? You could maybe try using just a single core

anna7287 commented 4 years ago

I'm looking at 150 samples. I've tried running unitig-counter with 1 core and with different inputs (half the samples, a single sample, or a single NCBI fasta), but I still end up with a similar pool allocation error. I think I should have enough memory, as the reported usage is 584136 but the capacity is 5242881088.
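For scale, the two numbers quoted from the log can be converted by hand. The snippet below is only an arithmetic sketch: the constants are copied from the log, while the helper names (`whole_mib`, `leftover_bytes`) are made up for illustration, and the output is assumed to be in bytes, as the "for 584120 bytes" wording suggests.

```cpp
#include <cstdint>

// Values copied from the log; assumed to be byte counts.
constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t capacity_bytes = 5242881088ull;
constexpr std::uint64_t first_usage_bytes = 584136ull;

// Hypothetical helpers: split a byte count into whole MiB plus a remainder.
constexpr std::uint64_t whole_mib(std::uint64_t bytes) { return bytes / kMiB; }
constexpr std::uint64_t leftover_bytes(std::uint64_t bytes) { return bytes % kMiB; }

// The reported capacity is 5000 MiB plus 1088 bytes.
static_assert(whole_mib(capacity_bytes) == 5000, "capacity is about 5000 MiB");
static_assert(leftover_bytes(capacity_bytes) == 1088, "plus 1088 bytes");

// The first reported usage is well under 1 MiB.
static_assert(whole_mib(first_usage_bytes) == 0, "usage is under 1 MiB");
```

So the reported usage is under 1 MiB against a reported capacity of roughly 5000 MiB. On its own this does not show that the memory was actually obtained from the system.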

rchikhi commented 4 years ago

That's odd. Sometimes increasing the -max-memory parameter helps. It does seem that your dataset is quite small (as evidenced by elapsed: 0 min 46 sec remaining: 0 min 51 sec). @johnlees perhaps you could try exposing this parameter so that Anna can do a test run with -max-memory 10000?

santeripuranen commented 4 years ago

@anna7287 Sorry to intrude.. It's the pool_malloc function from gatb-core that's complaining (Edit: specifically after the if( mainbuffer == NULL ) test, which suggests that a previous call to calloc in MemAllocator::reserve has failed, I think; capacity is just an aspirational number the way it's set up now, it doesn't tell how much is actually available or allocated). I've never seen that happen before, so I'm curious, what version of unitig-counter are you running? Did you get it from Conda or did you compile from source? What hardware (CPU and such) are you running on? Can you try a different version of the code or some other machine? The idea is to try to narrow down the circumstances when pool allocation fails. It's also possible that there's something fishy with your input data, so you could also try with some entirely different data just to make sure.

rchikhi commented 4 years ago

Hi @santeripuranen, as a GATB developer, I've seen Pool alloc failing before, usually because we've incorrectly estimated how much memory was needed, hence my -max-memory comment. Cheers, Rayan

santeripuranen commented 4 years ago

@rchikhi May I suggest adding a test for mainbuffer == NULL in MemAllocator::reserve, after the call to calloc, setting capacity = 0 if the allocation failed, and guarding the unsigned subtraction in pool_malloc? This would improve the quality of the exception message.
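The suggestion above could look roughly like the following. This is a minimal sketch under stated assumptions, not the gatb-core code: the class `SketchPool`, the `AllocFn` hook and `default_alloc` are hypothetical names, only `reserve` and `pool_malloc` echo names from the thread, and alignment is ignored for brevity.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical allocation hook so that a failing calloc can be simulated.
// It must return nullptr or memory that std::free can release.
using AllocFn = void* (*)(std::size_t count, std::size_t size);

inline void* default_alloc(std::size_t count, std::size_t size) {
    return std::calloc(count, size);
}

// Illustrative pool only; NOT the real MemAllocator from gatb-core.
class SketchPool {
public:
    explicit SketchPool(AllocFn alloc = default_alloc) : alloc_(alloc) {}
    ~SketchPool() { std::free(buffer_); }
    SketchPool(const SketchPool&) = delete;
    SketchPool& operator=(const SketchPool&) = delete;

    // Capacity is recorded only after the backing allocation succeeded,
    // so it never advertises memory that was not obtained.
    bool reserve(std::size_t bytes) {
        std::free(buffer_);
        used_ = 0;
        capacity_ = 0;
        buffer_ = static_cast<char*>(alloc_(bytes, 1));
        if (buffer_ == nullptr) return false;
        capacity_ = bytes;
        return true;
    }

    void* pool_malloc(std::size_t bytes) {
        if (buffer_ == nullptr) {
            throw std::runtime_error("pool has no backing buffer (reserve failed or was not called)");
        }
        // used_ <= capacity_ always holds here, so the subtraction cannot wrap.
        if (bytes > capacity_ - used_) {
            throw std::runtime_error("Pool allocation failed for " + std::to_string(bytes) +
                                     " bytes; usage " + std::to_string(used_) +
                                     ", capacity " + std::to_string(capacity_));
        }
        void* p = buffer_ + used_;
        used_ += bytes;
        return p;
    }

    std::size_t capacity() const { return capacity_; }
    std::size_t used() const { return used_; }

private:
    AllocFn alloc_;
    char* buffer_ = nullptr;
    std::size_t capacity_ = 0;
    std::size_t used_ = 0;
};
```

With this shape, a failed backing allocation leaves `capacity()` at 0 and produces a distinct error from a pool that is merely full, and a failed request does not increase `used()`.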

johnlees commented 4 years ago

Hi @rchikhi and @santeripuranen – thanks both for the insightful comments. Sounds like I need to make the -max-memory argument available. Will try and get round to this soon

rchikhi commented 4 years ago

@santeripuranen great suggestion, thanks. I've thrown an exception if that CALLOC() fails, to be on the safe side.

santeripuranen commented 4 years ago

@rchikhi Even better! Thanks!