marbl / parsnp

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, a core genome phylogeny and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

Parsnp memory issue #82

Open bkille opened 4 years ago

bkille commented 4 years ago

Dear bkille, thank you very much for the software. I am trying to run Parsnp on ~1700 Klebsiella pneumoniae genomes on a server cluster (ten nodes, 16 threads and 126 GB RAM per node), and Parsnp 1.2 always fails with the error described in #81 (there are no difficulties aligning smaller samples, up to 500 genomes). When running on ~1700 genomes, I found that RAM usage increased gradually to 126 GB and the job was then killed (out of memory). I therefore added -P 35000000 (and also tried -P 35000, -P 90000, -P 90, -P 35 and -P 35GB), but none of these stopped memory usage from climbing to 126 GB, which got the job killed and produced the #81 error.

So I would appreciate it if you could tell me:

Thanks in advance and have a nice weekend.

/Sun

Originally posted by @sunctx in https://github.com/marbl/parsnp/issues/81#issuecomment-667231796

bkille commented 4 years ago

@sunctx I used the Klebsiella genomes referenced in https://github.com/marbl/harvest/issues/22. There are roughly 1300 of them. I also used the ANI recruitment strategy, since the default recruitment includes genomes that don't align well at all.

/usr/bin/time -v parsnp -d ~/Data/klebsiella/*.fna -r ~/Data/klebsiella/GCF_900093815.1_18090_8_78_genomic.fna -p 30 --use-ani --min-ani 98 -P 500000

The command finished and used ~28GB of RAM at peak usage (see time output below). With the default partition size, -P 1500000, the peak RAM was much higher (around 90GB) but still under 128GB. With ~1700 genomes, I could see how you may run out of RAM with the default partition size. From my understanding of the source code, the partition size refers to the individual chunks that parsnp works with at one time (I believe it is in base pairs), but I can confirm with the code's author.

With your set of ~1700, decreasing the partition size should help. I'd be happy to help further as well. If you run your code with the --verbose flag and attach the output, I should be able to see where the binary runs out of memory.
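To compare partition sizes, the command above can be rebuilt from a small helper so that only -P changes between runs. This is just a sketch: the helper name `build_parsnp_command` and the file paths are hypothetical, and it uses only the flags already mentioned in this thread (-d, -r, -p, --use-ani, --min-ani, -P, --verbose). It prints the commands rather than running them.

```python
import shlex


def build_parsnp_command(genomes, reference, partition_size,
                         threads=30, min_ani=98, verbose=True):
    """Assemble a parsnp argument list, wrapped in /usr/bin/time -v.

    Only flags that appear in this thread are used; the function name
    and defaults are illustrative assumptions, not part of parsnp.
    """
    cmd = [
        "/usr/bin/time", "-v",
        "parsnp",
        "-d", *genomes,
        "-r", reference,
        "-p", str(threads),
        "--use-ani", "--min-ani", str(min_ani),
        "-P", str(partition_size),
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


# Hypothetical input paths; substitute your own genome files.
genomes = ["genomes/sample_a.fna", "genomes/sample_b.fna"]
reference = "genomes/reference.fna"

# The two partition sizes discussed above.
for size in (1500000, 500000):
    print(shlex.join(build_parsnp_command(genomes, reference, size)))
```

Each printed line can be pasted into a shell or passed to `subprocess.run` as a list; capturing stderr keeps both the --verbose log and the time report in one place.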

        User time (seconds): 167027.94
        System time (seconds): 1039.54
        Percent of CPU this job got: 429%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 10:52:35
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 28718316
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 131
        Minor (reclaiming a frame) page faults: 142279582
        Voluntary context switches: 783022
        Involuntary context switches: 1530536
        Swaps: 0
        File system inputs: 14283680
        File system outputs: 1700256
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
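
For reference, the "~28GB" figure comes from the "Maximum resident set size" line. A minimal Python sketch for pulling that value out of a saved time report and converting it is below. It assumes the reported kbytes are units of 1024 bytes, which is my understanding of GNU time on Linux; the function name `max_rss_kbytes` is made up for this example.

```python
import re


def max_rss_kbytes(time_output):
    """Return the peak resident set size (in kbytes) from `/usr/bin/time -v` text."""
    match = re.search(r"Maximum resident set size \(kbytes\):\s*(\d+)", time_output)
    if match is None:
        raise ValueError("no 'Maximum resident set size' line found")
    return int(match.group(1))


# The relevant line from the report above.
sample = "        Maximum resident set size (kbytes): 28718316\n"

kb = max_rss_kbytes(sample)
gb = kb * 1024 / 1e9    # decimal gigabytes, assuming 1 kbyte = 1024 bytes
gib = kb / 1024 ** 2    # binary gibibytes
print(f"peak RSS: {gb:.1f} GB ({gib:.1f} GiB)")
```

Under that assumption the run above peaked at about 29.4 GB (27.4 GiB), consistent with the rough ~28GB quoted earlier.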