I am always stuck at 3% done

Patcha-pou commented 11 months ago

During my attempts to run the Metagenomic Atlas software, I've encountered a recurring problem where the analysis becomes stuck at the 3% completion mark.

There doesn't appear to be any significant increase in resource utilization during this phase.

I'm currently uncertain about its underlying cause. I'm considering various possibilities, such as potential issues with the input data, an incomplete software installation, or other factors that might be contributing to this problem.

Could you please advise on this matter?

$ grep -A 8 rule $(ls -t .snakemake/log/* | head -1)
rule initialize_qc:
    input: /media/msb/disky/Patcha/Pandao/Shotgun/B63/NS.2113.004.IDT_i7_22---IDT_i5_22.B63_R1.fastq.gz, /media/msb/disky/Patcha/Pandao/Shotgun/B63/NS.2113.004.IDT_i7_22---IDT_i5_22.B63_R2.fastq.gz
    output: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    log: NS.2113.004.IDT/logs/QC/init.log
    jobid: 7
    reason: Missing output files: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    wildcards: sample=NS.2113.004.IDT
    priority: 80
    threads: 10

rule build_decontamination_db:
    input: /media/msb/disky/Patcha/databases/phiX174_virus.fa
    output: ref/genome/1/summary.txt
    log: logs/QC/build_decontamination_db.log
    jobid: 9
    reason: Missing output files: ref/genome/1/summary.txt
    threads: 8
    resources: tmpdir=/tmp, mem=60, java_mem=51, mem_mb=60000, mem_mib=60000, time_min=300, runtime=300
--
localrule dram_download:
    output: /media/msb/disky/Patcha/databases/DRAM/db, /media/msb/disky/Patcha/databases/DRAM/DRAM.config
    log: logs/dram/download_dram.log
    jobid: 87
    benchmark: logs/benchmarks/dram/download_dram.tsv
    reason: Missing output files: /media/msb/disky/Patcha/databases/DRAM/DRAM.config
    threads: 8
    resources: tmpdir=/tmp, mem=60, time=5, mem_mb=60000, mem_mib=60000, time_min=300, runtime=300
--
rule get_read_stats:
    input: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    output: NS.2113.004.IDT/sequence_quality_control/read_stats/raw.zip, NS.2113.004.IDT/sequence_quality_control/read_stats/raw_read_counts.tsv
    log: NS.2113.004.IDT/logs/QC/read_stats/raw.log
    jobid: 11
    reason: Missing output files: NS.2113.004.IDT/sequence_quality_control/read_stats/raw_read_counts.tsv, NS.2113.004.IDT/sequence_quality_control/read_stats/raw.zip; Input files updated by another job: NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R1.fastq.gz, NS.2113.004.IDT/sequence_quality_control/NS.2113.004.IDT_raw_R2.fastq.gz
    wildcards: sample=NS.2113.004.IDT, step=raw
    priority: 30
    threads: 10

SilasK commented 11 months ago

Metagenomic assembly uses a lot of resources, and a lot of steps need to be executed. The percentage might not increase rapidly, so check the number of steps done instead.
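One way to check the number of finished steps is to pull the most recent progress line out of the newest Snakemake log. This is only a sketch: the helper name `last_progress` is made up for illustration, and it assumes the log contains Snakemake's usual "N of M steps (P%) done" lines and that the logs live under .snakemake/log/, as in the grep command posted above.

```shell
# Hypothetical helper: print the latest "N of M steps (P%) done" line from a Snakemake log.
# With no argument it falls back to the newest file in .snakemake/log/ (assumed location).
last_progress() {
    log_file="${1:-$(ls -t .snakemake/log/* | head -1)}"
    # Keep only progress lines, then show the most recent one
    grep -E '[0-9]+ of [0-9]+ steps \([0-9.]+%\) done' "$log_file" | tail -n 1
}

# Usage, from the atlas working directory:
#   last_progress
```

If the step count in that line keeps going up between two checks, the run is progressing even when the percentage stays the same.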

You can try:

- If you encounter errors, run grep Error -a5 <snakemake log file> and look at the error files it points to.
- In the beginning, many databases need to be downloaded, which can take time.
  - You can run atlas run qc and download the databases later.
  - You can deactivate some annotations in the config file.
- You can try a test run (see the example data in the docs).
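The error check from the list above could be wrapped like this. It is a sketch rather than anything atlas ships: `show_errors` is a hypothetical name, the default log location is assumed to be .snakemake/log/, and the grep flags are spelled out as `-a -C 5` (treat the file as text, print 5 lines of context on each side).

```shell
# Hypothetical helper: show each "Error" line in a Snakemake log with 5 lines of context.
# With no argument it falls back to the newest file in .snakemake/log/ (assumed location).
show_errors() {
    log_file="${1:-$(ls -t .snakemake/log/* | head -1)}"
    # grep exits non-zero when nothing matches, so report that case explicitly
    grep -a -C 5 'Error' "$log_file" || echo "no 'Error' lines in $log_file"
}

# Usage, from the atlas working directory:
#   show_errors
```

The context lines usually include the `log:` entry of the failing rule, and that file is the place to look for the actual error message of the tool.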

Do you have a cluster to work with? Otherwise, you might be limited to running 1 or 2 steps at a time, which will drastically increase the run time compared to parallelizing it on a cluster system.

github-actions[bot] commented 9 months ago

There has been no activity for some time. I hope your issue has been solved in the meantime. This issue will automatically close soon if no further activity occurs.

Thank you for your contributions.

metagenome-atlas / atlas

I am always stuck at 3% done #691