starskyzheng / panpop

Application of pan-genome for population
MIT License

How does Panpop software process samples in parallel? #43

Closed JunlyMa closed 2 months ago

JunlyMa commented 4 months ago

Dear @starskyzheng,

Panpop is a powerful tool for identifying structural variants (SVs) in next-generation sequencing (NGS) data, and I truly appreciate your work in developing it. However, I can only analyze one sample at a time when calling SVs from NGS data, and I haven't found a way to submit samples in batch, such as with a for-loop script. Can Panpop process multiple NGS samples in parallel, generating files like ".gaffe.gam", ".gaffe.aug.gam", ".gaffe.aug.pg", etc. for several samples simultaneously? I currently have over two hundred samples, and running them one at a time is too slow. How should I set up parallel computing?

Thank you very much.

starskyzheng commented 4 months ago

PanPop uses Snakemake to support parallel execution, including running on a cluster. You can read the Snakemake documentation (https://snakemake.readthedocs.io/en/stable/), or just use PanPop directly for convenience.
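
For readers unfamiliar with Snakemake, here is a minimal sketch of how a parallel run might be launched from Python. The `--snakefile`, `--configfile`, `--cores`, `--keep-going`, and `--dry-run` options are standard Snakemake flags, but the file names `Snakefile` and `config.yaml` are placeholders (assumptions), not PanPop's actual paths; check the PanPop README for the real ones.

```python
import shlex


def build_snakemake_command(snakefile, configfile, cores, dry_run=False):
    """Build a Snakemake command line that runs independent jobs in parallel.

    `snakefile` and `configfile` are hypothetical paths supplied by the caller;
    `cores` caps how many CPU cores Snakemake may use across all jobs.
    """
    cmd = [
        "snakemake",
        "--snakefile", snakefile,
        "--configfile", configfile,
        "--cores", str(cores),
        "--keep-going",  # continue with other samples if one job fails
    ]
    if dry_run:
        cmd.append("--dry-run")  # only list the jobs that would be executed
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be inspected first.
    print(shlex.join(build_snakemake_command("Snakefile", "config.yaml", 32, dry_run=True)))
```

Running with `--dry-run` first shows which per-sample jobs Snakemake would schedule before any compute time is spent.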

JunlyMa commented 4 months ago

Thank you. I have learned some Snakemake, which has greatly improved my use of this program. However, a small number of samples always stall at the same step. They get as far as generating 2.callSV/.gaffe.aug.pg, .gaffe.aug.gam, .gaffe.aug.snarls, .gaffe.aug.q5.pack, and .gaffe.aug.q5.pack.DPinfo, but the .gaffe.aug.q5.call.ext.vcf.gz file that is then created stays empty while the program keeps running. It has been running for two days with no output, whereas normal samples produce their .gaffe.aug.q5.call.ext.vcf.gz and .gaffe.aug.q5.call.vcf.gz files within 2-4 hours. I have rerun these samples from the beginning many times; they always get stuck at this step, and I have not found the cause.
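
With over two hundred samples, it may help to list the stalled ones automatically. The sketch below assumes that output files are named `<sample>.gaffe.aug.q5.call.ext.vcf.gz` inside the `2.callSV` directory and that "empty" means zero bytes; both are assumptions based on the description above, and `find_empty_outputs` is a hypothetical helper, not part of PanPop.

```python
from pathlib import Path


def find_empty_outputs(call_dir, suffix=".gaffe.aug.q5.call.ext.vcf.gz", max_bytes=0):
    """Return sorted sample names whose output file exists but is (nearly) empty.

    A file counts as empty when its size is <= max_bytes. The default of 0
    matches only zero-byte files; raise it if empty outputs still contain
    a small compression header.
    """
    empty = []
    for path in Path(call_dir).glob(f"*{suffix}"):
        if path.is_file() and path.stat().st_size <= max_bytes:
            # Strip the suffix to recover the sample name.
            empty.append(path.name[: -len(suffix)])
    return sorted(empty)
```

Calling `find_empty_outputs("2.callSV")` would then return the sample names to investigate further.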

starskyzheng commented 4 months ago

Maybe the reads.fq.gz files are incomplete?
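
One way to test this hypothesis is to decompress each read file fully and see whether the gzip stream ends cleanly. This is a minimal sketch using only the Python standard library; `gzip_is_complete` is a hypothetical helper name. It detects truncated or corrupt compression only, so a file that decompresses cleanly could still be missing reads.

```python
import gzip
import os
import zlib


def gzip_is_complete(path, chunk_size=1 << 20):
    """Return True if `path` is a non-empty gzip file that decompresses to the end.

    Truncated files raise EOFError, corrupt or non-gzip data raises
    OSError (including gzip.BadGzipFile) or zlib.error, and a missing file
    raises FileNotFoundError (an OSError); all of these yield False.
    """
    try:
        if os.path.getsize(path) == 0:
            return False  # a zero-byte file would otherwise read as an empty stream
        with gzip.open(path, "rb") as handle:
            # Read through the whole stream in chunks without keeping the data.
            while handle.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        return False
    return True
```

Running this check on the read files of the stalled samples and comparing against a few samples that finished normally would show whether truncated input is the common factor.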

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.