vgteam / vg

tools for working with genome variation graphs
https://biostars.org/tag/vg/

autoindex killed by the cgroup out-of-memory handler #4118

Closed Ahahaha3 closed 10 months ago

Ahahaha3 commented 11 months ago

Hi, I used the command "~/software/vg autoindex -t 60 --workflow giraffe --target-mem 240G -r genome.fa -v LG01.vcf.gz -p pangenome" to create an index, but I got the error "slurmstepd: error: Detected 1 oom-kill event(s) in StepId=4378567.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler". The vcf.gz file is only 42M, the genome file is about 3G, and I set 60 threads and 240G of memory, so I can't understand why this happened.

jeizenga commented 11 months ago

The --target-mem option is more of a loose limit than a firm one. It's difficult to robustly predict future memory use, so vg autoindex does the best that it can to approximate it, but there will still be times that it goes over the target. You can try to coax it into lower memory use by choosing an even lower --target-mem. The cost of doing that is that sometimes it won't use all of the threads it has available. Also, if an individual task requires more memory than your OOM limit, then there's not much that vg autoindex can do about it.
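As a rough illustration of the advice above, the sketch below builds the same `vg autoindex` command with `--target-mem` set well below the job's memory limit. The helper name `build_autoindex_cmd` and the `fraction=0.5` margin are hypothetical choices made for this example, not vg recommendations; only the flags already shown in this thread are used.

```python
def build_autoindex_cmd(job_mem_gb, threads, fraction=0.5):
    """Build a vg autoindex argument list with --target-mem below the job limit.

    `fraction` is an assumed safety margin; tune it for your own data.
    """
    target_gb = max(1, int(job_mem_gb * fraction))
    return [
        "vg", "autoindex",
        "-t", str(threads),
        "--workflow", "giraffe",
        "--target-mem", f"{target_gb}G",
        "-r", "genome.fa",
        "-v", "LG01.vcf.gz",
        "-p", "pangenome",
    ]


# Example: the job is allowed 240G, so ask vg to aim for roughly half of that.
cmd = build_autoindex_cmd(job_mem_gb=240, threads=60)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Because the target is only a loose limit, a lower value reduces the chance of an OOM kill but does not rule it out.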

Can you copy the logging output of the command? That might help pinpoint where the memory use is coming from.

Ahahaha3 commented 11 months ago

Thanks for your reply! The log file is attached: slurm-4378567.log

jeizenga commented 11 months ago

Interesting, that's not usually an especially memory-intensive step. It could depend on some features of your data. Some figures that might be useful:

I also see that there are structural variants in this VCF. Is it primarily SVs, or are there also small variants?
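One way to answer that question is to tally the records in the VCF. The sketch below is a simple heuristic, not something vg provides: it counts an allele as an SV if it is symbolic or a breakend, if the INFO field carries `SVTYPE=`, or if the REF and ALT lengths differ by at least 50 bp. The 50 bp cutoff and the function names are assumptions made for this example.

```python
import gzip

SV_MIN_LEN = 50  # assumed size cutoff separating small variants from SVs


def classify_alt(ref, alt, info=""):
    """Return "sv" or "small" for one REF/ALT pair (heuristic only)."""
    if alt.startswith("<") or "[" in alt or "]" in alt or "SVTYPE=" in info:
        return "sv"
    if abs(len(ref) - len(alt)) >= SV_MIN_LEN:
        return "sv"
    return "small"


def count_variants(lines):
    """Count SV and small-variant alleles in an iterable of VCF lines."""
    counts = {"sv": 0, "small": 0}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # too short to contain REF and ALT
        ref, alts = fields[3], fields[4]
        info = fields[7] if len(fields) > 7 else ""
        for alt in alts.split(","):
            counts[classify_alt(ref, alt, info)] += 1
    return counts


def count_variants_in_file(path):
    """Same as count_variants, reading a gzip-compressed VCF from disk."""
    with gzip.open(path, "rt") as handle:
        return count_variants(handle)
```

For the file in this thread, the call would be `count_variants_in_file("LG01.vcf.gz")`. Explicit-sequence inversions have equal REF and ALT lengths, so this heuristic would count them as small variants unless `SVTYPE=` is present.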

Ahahaha3 commented 11 months ago

I used SyRI to re-align and got a new VCF file. The memory problem did not occur, but now there is a new error:

error:[vg::Constructor] non-ATGCN characters found in variant: LG09 1009 INVTR120847 N 0 PASS ChrB=CM040243.1;DupType=-;END=1578;EndB=68875429;Parent=.;StartB=68874859;VarType=SR

jeizenga commented 11 months ago

That looks like an invalid VCF. Either the reference or variant allele is missing in that line.

Ahahaha3 commented 11 months ago

Sorry, my previous description was wrong. The record that triggers the error actually looks like this: LG09 1009 INVTR120847 N INVTR 0 PASS ChrB=CM040243.1;DupType=-;END=1578;EndB=68875429;Paren=.;StartB=68874859;VarType=SR

jeizenga commented 11 months ago

That's still a malformed VCF record. It looks like it's missing the <> around a symbolic allele. As a forewarning, I don't think VG will be able to add this variant into the graph even if it's fixed. As far as I know, VG only supports INS, DEL, and INV symbolic alleles.

jeizenga commented 11 months ago

I probably should clarify that VG will stop throwing an error if this formatting error is corrected. It just won't be able to add it into the graph.
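The checks discussed in this thread can be sketched as a small pre-flight script that runs before `vg autoindex`. It is a minimal sketch and not vg's own validation: the function name and the returned messages are made up for this example, the supported set (INS, DEL, INV) is taken from the comment above, ignoring a subtype suffix such as `:ME` is an assumption, and breakend notation is simply reported as malformed.

```python
import re

# Symbolic alleles that vg can add to the graph, per the comment above.
SUPPORTED_SYMBOLIC = {"INS", "DEL", "INV"}
BASES = re.compile(r"^[ACGTNacgtn]+$")
SYMBOLIC = re.compile(r"^<([^<>]+)>$")


def check_record(line):
    """Return "ok" or a short description of the first problem found."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return "missing column"
    ref, alts = fields[3], fields[4]
    if not BASES.match(ref):
        return "non-ACGTN reference allele"
    for alt in alts.split(","):
        if alt in {".", "*"} or BASES.match(alt):
            continue  # missing, spanning-deletion, or plain-sequence allele
        match = SYMBOLIC.match(alt)
        if match is None:
            return "malformed alternate allele"
        # Assumption: a subtype suffix such as ":ME" is ignored.
        if match.group(1).split(":")[0] not in SUPPORTED_SYMBOLIC:
            return "unsupported symbolic allele"
    return "ok"


info = "ChrB=CM040243.1;END=1578;VarType=SR"
print(check_record(f"LG09\t1009\tINVTR120847\tN\tINVTR\t0\tPASS\t{info}"))
print(check_record(f"LG09\t1009\tINVTR120847\tN\t<INVTR>\t0\tPASS\t{info}"))
```

The first call reports the missing angle brackets. The second reports a record that is well formed but that vg would not be expected to add to the graph.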