pangenome / pggb

the pangenome graph builder
https://doi.org/10.1038/s41592-024-02430-3
MIT License

set up Memory and threads of pggb #36

Closed by XuewenWangUGA 2 years ago

XuewenWangUGA commented 3 years ago

Dear Sir/Madam,

Is there any information on how to set the memory and threads for pggb? For example, suppose there are seven genomic sequences, each 100 Mbp long. For a fast analysis, there is a trade-off between the memory per thread and the number of threads. How should I balance them? What is the minimum memory per thread, e.g. 4 GB? Do more threads matter more than the memory per thread in the analysis? Thanks.

fmobegi commented 3 years ago

I am having the same issue with pggb. I tried 13 genomes, each ~45 Mbp, on an HPC cluster, but the tasks keep dying. I was hoping to see an answer to this issue here :(

subwaystation commented 3 years ago

As PGGB is getting more and more stable, we will test this extensively in the near future. We will suggest default parameters for different species such as human, yeast, Arabidopsis, and mouse.

The key parameters that will eat most of your RAM are -w, --block-weight-max N and -T, --poa-threads N. With the first set to 170000 and the second to 24 on a 48-thread node with 128 GB RAM, we are able to produce chromosomal pangenome graphs for ~40 human assemblies plus the two references GRCh38 and CHM13.
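For anyone who wants to script this, here is a minimal sketch that assembles a pggb command line from the numbers above. It is not an official recipe. The helper name `build_pggb_command`, the file names, and the "half of the node's threads" default for `-T` are assumptions taken from the single 48-thread example above. The `-i`, `-o` and `-t` options are not mentioned in this thread, and option names have changed between pggb releases, so please check `pggb --help` for your version before running anything.

```python
import shlex


def build_pggb_command(fasta, out_dir, node_threads,
                       poa_threads=None, block_weight_max=170000):
    """Return a pggb argument list (hypothetical helper, not part of pggb).

    -T and -w are the two RAM-heavy options named in this thread.
    -i, -o and -t are assumed here; verify them with `pggb --help`.
    """
    if node_threads < 1:
        raise ValueError("node_threads must be at least 1")
    if poa_threads is None:
        # Assumption: mirror the 24-of-48 ratio from the example above.
        poa_threads = max(1, node_threads // 2)
    if poa_threads > node_threads:
        raise ValueError("poa_threads should not exceed node_threads")
    return [
        "pggb",
        "-i", fasta,                    # input FASTA (assumed option)
        "-o", out_dir,                  # output directory (assumed option)
        "-t", str(node_threads),        # overall thread count (assumed option)
        "-T", str(poa_threads),         # --poa-threads, from this thread
        "-w", str(block_weight_max),    # --block-weight-max, from this thread
    ]


if __name__ == "__main__":
    # Placeholder file names; replace them with your own paths.
    cmd = build_pggb_command("assemblies.fa.gz", "pggb_out", node_threads=48)
    print(shlex.join(cmd))
```

If the tasks still die on a smaller node, lowering the values passed for `-T` or `-w` is the obvious thing to try, since those are the two options named above as using the most RAM.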

subwaystation commented 3 years ago

I hope this gives you a hint.