metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

I am always stuck at 3% done #691

Closed Patcha-pou closed 9 months ago

Patcha-pou commented 11 months ago

Whenever I run Metagenome-Atlas, the analysis gets stuck at the 3% completion mark.

Resource utilization does not appear to increase significantly during this phase.

I'm not sure what the underlying cause is. Possibilities I'm considering include problems with the input data, an incomplete software installation, or some other factor.

Could you please advise on this matter?

$ grep -A 8 rule $(ls -t .snakemake/log/* | head -1)
rule initialize_qc:
    input: /media/msb/disky/Patcha/Pandao/Shotgun/B63/NS.2113.004.IDT_i7_22---IDT_i5_22.B63_R1.fastq.gz, /media/msb/disky/Patcha/Pandao/Shotgun/B63/NS.2113.004.IDT_i7_22---IDT_i5_22.B63_R2.fastq.gz
    output: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    log: NS.2113.004.IDT/logs/QC/init.log
    jobid: 7
    reason: Missing output files: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    wildcards: sample=NS.2113.004.IDT
    priority: 80
    threads: 10

rule build_decontamination_db:
    input: /media/msb/disky/Patcha/databases/phiX174_virus.fa
    output: ref/genome/1/summary.txt
    log: logs/QC/build_decontamination_db.log
    jobid: 9
    reason: Missing output files: ref/genome/1/summary.txt
    threads: 8
    resources: tmpdir=/tmp, mem=60, java_mem=51, mem_mb=60000, mem_mib=60000, time_min=300, runtime=300

--
localrule dram_download:
    output: /media/msb/disky/Patcha/databases/DRAM/db, /media/msb/disky/Patcha/databases/DRAM/DRAM.config
    log: logs/dram/download_dram.log
    jobid: 87
    benchmark: logs/benchmarks/dram/download_dram.tsv
    reason: Missing output files: /media/msb/disky/Patcha/databases/DRAM/DRAM.config
    threads: 8
    resources: tmpdir=/tmp, mem=60, time=5, mem_mb=60000, mem_mib=60000, time_min=300, runtime=300

--
rule get_read_stats:
    input: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    output: NS.2113.004.IDT/sequence_quality_control/read_stats/raw.zip, NS.2113.004.IDT/sequence_quality_control/read_stats/raw_read_counts.tsv
    log: NS.2113.004.IDT/logs/QC/read_stats/raw.log
    jobid: 11
    reason: Missing output files: NS.2113.004.IDT/sequence_quality_control/read_stats/raw_read_counts.tsv, NS.2113.004.IDT/sequence_quality_control/read_stats/raw.zip; Input files updated by another job: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    wildcards: sample=NS.2113.004.IDT, step=raw
    priority: 30
    threads: 10

SilasK commented 11 months ago

Metagenomic assembly uses a lot of resources, and many steps need to be executed. The percentage might not increase rapidly, so check the number of steps done instead.

You can try:

Do you have a cluster to work with? Otherwise, you might be limited to running 1 or 2 steps at a time, which drastically increases the run time compared to parallelizing the workflow on a cluster system.
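
One way to check the number of steps done, rather than the percentage, is to read the progress lines that Snakemake writes to its log. The following is a minimal sketch, not part of ATLAS: it assumes the log directory is `.snakemake/log` (as in the `grep` command above) and that progress lines have the form "N of M steps (P%) done". The helper names `last_progress` and `newest_log` are hypothetical.

```python
import re
from pathlib import Path

# Assumed format of Snakemake progress lines: "N of M steps (P%) done".
PROGRESS = re.compile(r"(\d+) of (\d+) steps \((\d+(?:\.\d+)?)%\) done")


def last_progress(log_text):
    """Return (done, total) from the last progress line, or None if there is none."""
    matches = PROGRESS.findall(log_text)
    if not matches:
        return None
    done, total, _percent = matches[-1]
    return int(done), int(total)


def newest_log(log_dir=".snakemake/log"):
    """Return the most recently modified file in log_dir, or None if there is none."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return None
    logs = [p for p in directory.iterdir() if p.is_file()]
    if not logs:
        return None
    return max(logs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    log = newest_log()
    if log is None:
        print("No Snakemake log found; run this from the ATLAS working directory.")
    else:
        progress = last_progress(log.read_text(errors="replace"))
        if progress is None:
            print(f"{log}: no step has finished yet")
        else:
            done, total = progress
            print(f"{log}: {done} of {total} steps done")
```

If the count of finished steps goes up between two runs of this script, the workflow is still progressing even when the rounded percentage stays at 3%.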

github-actions[bot] commented 9 months ago

There has been no activity for some time. I hope your issue has been solved in the meantime. This issue will close automatically soon if no further activity occurs.

Thank you for your contributions.