PacificBiosciences / pb-metagenomics-tools

Tools and pipelines tailored to using PacBio HiFi Reads for metagenomics
BSD 3-Clause Clear License

SemiBin2 fails to generate coverage files - Snakemake rule failure #69

Closed MicroSeq closed 11 months ago

MicroSeq commented 12 months ago

I've had trouble resolving this issue because the log file is vague. I was able to work around it by removing SemiBin from the workflow, but I've now tried executing on two different systems with different inputs and ran into the same issue both times:

log:
SemiBin[2939283] INFO Binning for long_read
SemiBin[2939283] INFO Did not detect GPU, using CPU.
SemiBin[2939283] INFO Generating training data...
SemiBin[2939283] INFO Calculating coverage for every sample.
Error: Generating coverage file fail

Output from snakemake:
rule SemiBin2Analysis:
    input: /pb-metagenomics-tools/HiFi-MAG-Pipeline/1-long-contigs/Qps_2ml_frozen_test/Qps_2ml_frozen_test.incomplete_contigs.fasta, /pb-metagenomics-tools/HiFi-MAG-Pipeline/2-bam/Qps_2ml_frozen_test.bam
    output: /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test/bins_info.tsv, /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test
    log: /pb-metagenomics-tools/HiFi-MAG-Pipeline/logs/Qps_2ml_frozen_test.SemiBin2Analysis.log
    jobid: 24
    benchmark: /pb-metagenomics-tools/HiFi-MAG-Pipeline/benchmarks/Qps_2ml_frozen_test.SemiBin2Analysis.tsv
    reason: Missing output files: /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test
    wildcards: sample=Qps_2ml_frozen_test
    threads: 24
    resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/282ff4bae0d22589d40068be7e4c30dc

Error in rule SemiBin2Analysis:
    jobid: 24
    input: /pb-metagenomics-tools/HiFi-MAG-Pipeline/1-long-contigs/Qps_2ml_frozen_test/Qps_2ml_frozen_test.incomplete_contigs.fasta, /pb-metagenomics-tools/HiFi-MAG-Pipeline/2-bam/Qps_2ml_frozen_test.bam
    output: /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test/bins_info.tsv, /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test
    log: /pb-metagenomics-tools/HiFi-MAG-Pipeline/logs/Qps_2ml_frozen_test.SemiBin2Analysis.log (check log file(s) for error details)
    conda-env: /pb-metagenomics-tools/HiFi-MAG-Pipeline/.snakemake/conda/282ff4bae0d22589d40068be7e4c30dc
    shell:
        SemiBin single_easy_bin -i /pb-metagenomics-tools/HiFi-MAG-Pipeline/1-long-contigs/Qps_2ml_frozen_test/Qps_2ml_frozen_test.incomplete_contigs.fasta -b /pb-metagenomics-tools/HiFi-MAG-Pipeline/2-bam/Qps_2ml_frozen_test.bam -o /pb-metagenomics-tools/HiFi-MAG-Pipeline/3-semibin2/Qps_2ml_frozen_test --self-supervised --sequencing-type=long_reads --compression=none -t 24 --tag-output semibin2 --environment=global --verbose --tmpdir=/GTDB/Scratch &> /pb-metagenomics-tools/HiFi-MAG-Pipeline/logs/Qps_2ml_frozen_test.SemiBin2Analysis.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

MicroSeq commented 12 months ago

I resolved this by updating the SemiBin version. It seems to execute now. I will follow up if it causes any downstream issues with the workflow.

Note that you need to change the channel priorities for the install to work when using strict channel priority.

envs/semibin.yml
name: semibin_env
channels:
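
For anyone unsure what "strict" refers to here: conda has a `channel_priority` setting whose value can be `strict`, `flexible`, or `disabled`. The sketch below only reports the value currently in effect. The helper name `show_channel_priority` is made up for illustration, and this thread does not state which channel order the updated environment file ended up with, so none is shown.

```shell
# Hypothetical helper (not part of the pipeline): print conda's current
# channel_priority setting, or a notice if conda is not installed.
show_channel_priority() {
    if command -v conda >/dev/null 2>&1; then
        # `conda config --show <key>` prints the effective value of one setting
        conda config --show channel_priority
    else
        echo "conda not found on PATH"
    fi
}

show_channel_priority

# To switch the user-level setting to strict (assumption: this is the mode
# the comment above means), one would run:
#   conda config --set channel_priority strict
```

Checking the setting first makes it easier to tell whether a failed environment build comes from the solver's priority mode or from the channel list in the environment file.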

dportik commented 11 months ago

Hi @MicroSeq, thanks for the information about this bug. I'll keep an eye on things as SemiBin2 continues to get updated. I incorporated SemiBin2 into the workflow before its official release, so the command-line syntax is clunky. I want to update this to the new syntax soon, which will also require bumping the version.