gbouras13 / hybracter

Automated long-read first bacterial genome assembly tool implemented in Snakemake using Snaketool.
MIT License

Error in rule medaka_incomplete #62

Open damientully opened 8 months ago

damientully commented 8 months ago

Hi,

I am getting the following error when running the command hybracter test-long, which I presume is related to Medaka. Have you encountered this issue before, and do you know of a solution?

Many thanks, Damien

Finished job 58.
18 of 47 steps (38%) done
Select jobs to execute...
Execute 1 jobs...

[Thu Mar 21 10:19:24 2024]
localrule dnaapler_pre_chrom:
    input: hybracter_out/processing/pre_polish/Sample1_chromosome.fasta, hybracter_out/processing/pre_polish/Sample1_ignore_list.txt
    output: hybracter_out/processing/complete/dnaapler/Sample1_pre_chrom/Sample1_reoriented.fasta
    log: hybracter_out/stderr/dnaapler/Sample1_pre_chrom.log
    jobid: 64
    benchmark: hybracter_out/benchmarks/dnaapler/Sample1_pre_chrom.txt
    reason: Missing output files: hybracter_out/processing/complete/dnaapler/Sample1_pre_chrom/Sample1_reoriented.fasta; Input files updated by another job: hybracter_out/processing/pre_polish/Sample1_chromosome.fasta, hybracter_out/processing/pre_polish/Sample1_ignore_list.txt
    wildcards: sample=Sample1
    threads: 8
    resources: tmpdir=/var/folders/v6/3dzc71_x309c1sxz_sghh9_h0000gp/T, mem_mb=15259, mem_mib=15259, mem=16000MB, time=08:00:00

        dnaapler all -i hybracter_out/processing/pre_polish/Sample1_chromosome.fasta -o hybracter_out/processing/complete/dnaapler/Sample1_pre_chrom --ignore hybracter_out/processing/pre_polish/Sample1_ignore_list.txt -p Sample1 -t 8 -a nearest --db dnaa,repa -f 2> hybracter_out/stderr/dnaapler/Sample1_pre_chrom.log

Activating conda environment: ../miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/df445695c6e5812bf1a0f901a98826d9_
[Thu Mar 21 10:19:25 2024]
Error in rule plassembler_long:
    jobid: 59
    input: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
    output: hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta, hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv, hybracter_out/versions/Sample1/plassembler.version
    log: hybracter_out/stderr/plassembler_long/Sample1.log (check log file(s) for error details)
    conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
    shell:

        if unicycler --version ; then
             plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
        else
            pip install git+https://github.com/rrwick/Unicycler.git
            plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
        fi
        touch hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta
        touch hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv
        plassembler --version > hybracter_out/versions/Sample1/plassembler.version

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile hybracter_out/stderr/plassembler_long/Sample1.log:
================================================================================
2024-03-21 10:19:24.584 | INFO     | plassembler:begin_plassembler:100 - You are using Plassembler version 1.6.2
2024-03-21 10:19:24.584 | INFO     | plassembler:begin_plassembler:101 - Repository homepage is https://github.com/gbouras13/plassembler
2024-03-21 10:19:24.585 | INFO     | plassembler:begin_plassembler:102 - Written by George Bouras: george.bouras@adelaide.edu.au
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1294 - Database directory is /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1295 - Longreads file is hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1296 - Chromosome length threshold is 50000
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1297 - Output directory is hybracter_out/processing/plassembler/Sample1
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1298 - Min long read length is 500
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1299 - Min long read quality is 9
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1300 - Thread count is 16
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1301 - --force is True
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1302 - --skip_qc is True
2024-03-21 10:19:24.585 | INFO     | plassembler:long:1303 - --raw_flag is False
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1304 - --pacbio_model is nothing
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1305 - --keep_chromosome is False
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1306 - --flye_directory is hybracter_out/processing/assemblies/Sample1
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1307 - --flye_assembly is nothing
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1308 - --flye_info is nothing
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1309 - --corrected_error_rate is 0.12
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1310 - --no_chromosome is False
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1311 - --depth_filter is 0.25
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1312 - --unicycler_options is None
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1313 - --spades_options is None
2024-03-21 10:19:24.586 | INFO     | plassembler:long:1317 - Checking dependencies
2024-03-21 10:19:24.760 | INFO     | plassembler.utils.input_commands:check_dependencies:199 - Flye version found is v2.9.3-b1797.
2024-03-21 10:19:24.761 | INFO     | plassembler.utils.input_commands:check_dependencies:209 - Flye version is ok.
2024-03-21 10:19:24.784 | INFO     | plassembler.utils.input_commands:check_dependencies:218 - Raven v1.8.3 found.
2024-03-21 10:19:24.785 | INFO     | plassembler.utils.input_commands:check_dependencies:220 - Raven version is ok.
2024-03-21 10:19:24.964 | INFO     | plassembler.utils.input_commands:check_dependencies:242 - Unicycler version found is v0.5.0.
2024-03-21 10:19:24.965 | INFO     | plassembler.utils.input_commands:check_dependencies:255 - Unicycler version is ok.
2024-03-21 10:19:25.543 | ERROR    | plassembler.utils.input_commands:check_dependencies:267 - SPAdes not found.
================================================================================

[Thu Mar 21 10:19:32 2024]
Error in rule medaka_incomplete:
    jobid: 41
    input: hybracter_out/processing/incomp_pre_polish/Sample2.fasta, hybracter_out/processing/qc/Sample2_filt_trim.fastq.gz
    output: hybracter_out/processing/incomplete/medaka_incomplete/Sample2/consensus.fasta, hybracter_out/versions/Sample2/medaka_incomplete.version, hybracter_out/supplementary_results/intermediate_incomplete_assemblies/Sample2/Sample2_medaka.fasta
    log: hybracter_out/stderr/medaka_incomplete/Sample2.log (check log file(s) for error details)
    conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/3a238a896824eb2007e785c8a56e5932_
    shell:

        medaka_consensus -i hybracter_out/processing/qc/Sample2_filt_trim.fastq.gz -d hybracter_out/processing/incomp_pre_polish/Sample2.fasta -o hybracter_out/processing/incomplete/medaka_incomplete/Sample2 -m r1041_e82_400bps_sup_v4.2.0  -t 16 2> hybracter_out/stderr/medaka_incomplete/Sample2.log
        medaka --version > hybracter_out/versions/Sample2/medaka_incomplete.version
        cp hybracter_out/processing/incomplete/medaka_incomplete/Sample2/consensus.fasta hybracter_out/supplementary_results/intermediate_incomplete_assemblies/Sample2/Sample2_medaka.fasta
        touch hybracter_out/processing/incomplete/medaka_incomplete/Sample2/calls_to_draft.bam
        rm hybracter_out/processing/incomplete/medaka_incomplete/Sample2/calls_to_draft.bam
        touch hybracter_out/processing/incomplete/medaka_incomplete/Sample2/consensus_probs.hdf
        rm hybracter_out/processing/incomplete/medaka_incomplete/Sample2/consensus_probs.hdf

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile hybracter_out/stderr/medaka_incomplete/Sample2.log:
================================================================================
  File "<string>", line 1
    import sys; import itertools; mods=[f'{x[:-1]}' for x in sys.stdin.readline().split() if ('variant' not in x and 'snp' not in x)]; print(' '.join(mods))
                                                  ^
SyntaxError: invalid syntax
================================================================================

[Thu Mar 21 10:19:38 2024]
Finished job 64.
19 of 47 steps (40%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-21T101748.144217.snakemake.log
WorkflowError:
At least one job did not complete successfully.
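
A note on the medaka_incomplete log above: the caret points at an f-string literal, and f-strings were only added in Python 3.6. One plausible reading, offered as an assumption rather than anything confirmed in this thread, is that the python picked up for that one-liner predates 3.6. The sketch below uses hypothetical helper names and is not Medaka's code. It deliberately avoids f-strings so that it can be run with whichever python the Medaka environment resolves, to report its version.

```python
import sys


def supports_fstrings(version_info=None):
    """Return True if the given interpreter version can parse f-strings (3.6+)."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (3, 6)


def strip_last_char(words):
    """f-string-free equivalent of the list comprehension in the failing one-liner.

    Drops entries containing 'variant' or 'snp' and trims the final
    character of each remaining entry.
    """
    return [w[:-1] for w in words if "variant" not in w and "snp" not in w]


if __name__ == "__main__":
    # Single-argument print calls keep this runnable on very old interpreters.
    print("interpreter: " + sys.executable)
    print("version: " + ".".join(str(p) for p in sys.version_info[:3]))
    print("f-strings supported: " + str(supports_fstrings()))
```

If the reported version is below 3.6, the SyntaxError would be expected regardless of how Medaka itself was installed.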

gbouras13 commented 8 months ago

Hi @damientully ,

Looks like the error is probably with plassembler_long (the rule above), which cascades down to Medaka. Based on the log, the environment doesn't have SPAdes.

Please try:

conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
mamba install  bioconda::spades
conda deactivate

If you still get a Medaka error after that, please try:

conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/3a238a896824eb2007e785c8a56e5932_
mamba uninstall medaka
mamba install medaka 
conda deactivate

George
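
After installing a package into one of the workflow's conda environments, it can help to confirm that the executables actually resolve on PATH from inside that environment. The following is a minimal sketch, assuming it is run with the environment activated. The tool names are taken from the dependency checks in the plassembler log, and the helper name locate_tools is hypothetical. It does not reproduce how plassembler performs its own check.

```python
import os
import shutil


def locate_tools(names, path=None):
    """Map each executable name to its resolved path, or None if not found.

    `path` is an optional PATH-style string; when None, the current
    process's PATH is searched.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    # Names as they appear in the plassembler dependency check log.
    tools = ["flye", "raven", "unicycler", "spades.py"]
    print("CONDA_PREFIX:", os.environ.get("CONDA_PREFIX", "<not set>"))
    for name, resolved in locate_tools(tools).items():
        print(f"{name:12s} -> {resolved or 'NOT FOUND on PATH'}")
```

A path outside the printed CONDA_PREFIX would suggest that the tool is being picked up from a different environment than the one the rule activates.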

damientully commented 8 months ago

Thanks @gbouras13

Funnily enough, I have tried this:

conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
mamba install  bioconda::spades
conda deactivate

but I still get an error with spades:

Error in rule plassembler_long:
    jobid: 43
    input: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
    output: hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta, hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv, hybracter_out/versions/Sample1/plassembler.version
    log: hybracter_out/stderr/plassembler_long/Sample1.log (check log file(s) for error details)
    conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
    shell:

        if unicycler --version ; then
             plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
        else
            pip install git+https://github.com/rrwick/Unicycler.git
            plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
        fi
        touch hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta
        touch hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv
        plassembler --version > hybracter_out/versions/Sample1/plassembler.version

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile hybracter_out/stderr/plassembler_long/Sample1.log:
================================================================================
2024-03-21 11:08:40.353 | INFO     | plassembler:begin_plassembler:100 - You are using Plassembler version 1.6.2
2024-03-21 11:08:40.353 | INFO     | plassembler:begin_plassembler:101 - Repository homepage is https://github.com/gbouras13/plassembler
2024-03-21 11:08:40.353 | INFO     | plassembler:begin_plassembler:102 - Written by George Bouras: george.bouras@adelaide.edu.au
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1294 - Database directory is /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1295 - Longreads file is hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1296 - Chromosome length threshold is 50000
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1297 - Output directory is hybracter_out/processing/plassembler/Sample1
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1298 - Min long read length is 500
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1299 - Min long read quality is 9
2024-03-21 11:08:40.354 | INFO     | plassembler:long:1300 - Thread count is 16
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1301 - --force is True
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1302 - --skip_qc is True
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1303 - --raw_flag is False
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1304 - --pacbio_model is nothing
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1305 - --keep_chromosome is False
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1306 - --flye_directory is hybracter_out/processing/assemblies/Sample1
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1307 - --flye_assembly is nothing
2024-03-21 11:08:40.355 | INFO     | plassembler:long:1308 - --flye_info is nothing
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1309 - --corrected_error_rate is 0.12
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1310 - --no_chromosome is False
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1311 - --depth_filter is 0.25
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1312 - --unicycler_options is None
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1313 - --spades_options is None
2024-03-21 11:08:40.356 | INFO     | plassembler:long:1317 - Checking dependencies
2024-03-21 11:08:40.818 | INFO     | plassembler.utils.input_commands:check_dependencies:199 - Flye version found is v2.9.3-b1797.
2024-03-21 11:08:40.819 | INFO     | plassembler.utils.input_commands:check_dependencies:209 - Flye version is ok.
2024-03-21 11:08:40.869 | INFO     | plassembler.utils.input_commands:check_dependencies:218 - Raven v1.8.3 found.
2024-03-21 11:08:40.870 | INFO     | plassembler.utils.input_commands:check_dependencies:220 - Raven version is ok.
2024-03-21 11:08:41.144 | INFO     | plassembler.utils.input_commands:check_dependencies:242 - Unicycler version found is v0.5.0.
2024-03-21 11:08:41.145 | INFO     | plassembler.utils.input_commands:check_dependencies:255 - Unicycler version is ok.
2024-03-21 11:08:41.995 | ERROR    | plassembler.utils.input_commands:check_dependencies:267 - SPAdes not found.
================================================================================

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-21T110830.798520.snakemake.log
WorkflowError:
At least one job did not complete successfully.

gbouras13 commented 8 months ago

Maybe try

conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
spades.py --help

and see what the error is?

George

damientully commented 8 months ago

Thanks George. That seems to work fine but plassembler doesn't seem to be picking it up:

2024-03-21 11:32:31.949 | ERROR | plassembler.utils.input_commands:check_dependencies:267 - SPAdes not found.

$ spades.py --help
SPAdes genome assembler v3.15.2

Usage: spades.py [options] -o <output_dir>

Basic options:
  -o <output_dir>             directory to store all the resulting files (required)
  --isolate                   this flag is highly recommended for high-coverage isolate and multi-cell data
  --sc                        this flag is required for MDA (single-cell) data
  --meta                      this flag is required for metagenomic data
  --bio                       this flag is required for biosyntheticSPAdes mode
  --corona                    this flag is required for coronaSPAdes mode
  --rna                       this flag is required for RNA-Seq data
  --plasmid                   runs plasmidSPAdes pipeline for plasmid detection
  --metaviral                 runs metaviralSPAdes pipeline for virus detection
  --metaplasmid               runs metaplasmidSPAdes pipeline for plasmid detection in metagenomic datasets (equivalent for --meta --plasmid)
  --rnaviral                  this flag enables virus assembly module from RNA-Seq data
  --iontorrent                this flag is required for IonTorrent data
  --test                      runs SPAdes on toy dataset
  -h, --help                  prints this usage message
  -v, --version               prints version

Input data:
  --12 <filename>             file with interlaced forward and reverse paired-end reads
  -1 <filename>               file with forward paired-end reads
  -2 <filename>               file with reverse paired-end reads
  -s <filename>               file with unpaired reads
  --merged <filename>         file with merged forward and reverse paired-end reads
  --pe-12 <#> <filename>      file with interlaced reads for paired-end library number <#>.
                              Older deprecated syntax is -pe<#>-12 <filename>
  --pe-1 <#> <filename>       file with forward reads for paired-end library number <#>.
                              Older deprecated syntax is -pe<#>-1 <filename>
  --pe-2 <#> <filename>       file with reverse reads for paired-end library number <#>.
                              Older deprecated syntax is -pe<#>-2 <filename>
  --pe-s <#> <filename>       file with unpaired reads for paired-end library number <#>.
                              Older deprecated syntax is -pe<#>-s <filename>
  --pe-m <#> <filename>       file with merged reads for paired-end library number <#>.
                              Older deprecated syntax is -pe<#>-m <filename>
  --pe-or <#> <or>            orientation of reads for paired-end library number <#> 
                              (<or> = fr, rf, ff).
                              Older deprecated syntax is -pe<#>-<or>
  --s <#> <filename>          file with unpaired reads for single reads library number <#>.
                              Older deprecated syntax is --s<#> <filename>
  --mp-12 <#> <filename>      file with interlaced reads for mate-pair library number <#>.
                              Older deprecated syntax is -mp<#>-12 <filename>
  --mp-1 <#> <filename>       file with forward reads for mate-pair library number <#>.
                              Older deprecated syntax is -mp<#>-1 <filename>
  --mp-2 <#> <filename>       file with reverse reads for mate-pair library number <#>.
                              Older deprecated syntax is -mp<#>-2 <filename>
  --mp-s <#> <filename>       file with unpaired reads for mate-pair library number <#>.
                              Older deprecated syntax is -mp<#>-s <filename>
  --mp-or <#> <or>            orientation of reads for mate-pair library number <#> 
                              (<or> = fr, rf, ff).
                              Older deprecated syntax is -mp<#>-<or>
  --hqmp-12 <#> <filename>    file with interlaced reads for high-quality mate-pair library number <#>.
                              Older deprecated syntax is -hqmp<#>-12 <filename>
  --hqmp-1 <#> <filename>     file with forward reads for high-quality mate-pair library number <#>.
                              Older deprecated syntax is -hqmp<#>-1 <filename>
  --hqmp-2 <#> <filename>     file with reverse reads for high-quality mate-pair library number <#>.
                              Older deprecated syntax is -hqmp<#>-2 <filename>
  --hqmp-s <#> <filename>     file with unpaired reads for high-quality mate-pair library number <#>.
                              Older deprecated syntax is -hqmp<#>-s <filename>
  --hqmp-or <#> <or>          orientation of reads for high-quality mate-pair library number <#> 
                              (<or> = fr, rf, ff).
                              Older deprecated syntax is -hqmp<#>-<or>
  --sanger <filename>         file with Sanger reads
  --pacbio <filename>         file with PacBio reads
  --nanopore <filename>       file with Nanopore reads
  --trusted-contigs <filename>
                              file with trusted contigs
  --untrusted-contigs <filename>
                              file with untrusted contigs

Pipeline options:
  --only-error-correction     runs only read error correction (without assembling)
  --only-assembler            runs only assembling (without read error correction)
  --careful                   tries to reduce number of mismatches and short indels
  --checkpoints <last or all>
                              save intermediate check-points ('last', 'all')
  --continue                  continue run from the last available check-point (only -o should be specified)
  --restart-from <cp>         restart run with updated options and from the specified check-point
                              ('ec', 'as', 'k<int>', 'mc', 'last')
  --disable-gzip-output       forces error correction not to compress the corrected reads
  --disable-rr                disables repeat resolution stage of assembling

Advanced options:
  --dataset <filename>        file with dataset description in YAML format
  -t <int>, --threads <int>   number of threads. [default: 16]
  -m <int>, --memory <int>    RAM limit for SPAdes in Gb (terminates if exceeded). [default: 250]
  --tmp-dir <dirname>         directory for temporary files. [default: <output_dir>/tmp]
  -k <int> [<int> ...]        list of k-mer sizes (must be odd and less than 128)
                              [default: 'auto']
  --cov-cutoff <float>        coverage cutoff value (a positive float number, or 'auto', or 'off')
                              [default: 'off']
  --phred-offset <33 or 64>   PHRED quality offset in the input reads (33 or 64),
                              [default: auto-detect]
  --custom-hmms <dirname>     directory with custom hmms that replace default ones,
                              [default: None]

gbouras13 commented 8 months ago

That is very strange - maybe try spades.py --version

damientully commented 8 months ago

$ spades.py --version
SPAdes genome assembler v3.15.2
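
Since spades.py runs from the interactive shell but the workflow still reports it as missing, one way to narrow things down is to launch the same command as a child process from Python inside the activated environment and look at the exit code and both output streams. This is only a sketch with a hypothetical helper name, probe_version. How plassembler actually checks for SPAdes is not shown in this thread, so the two may differ.

```python
import subprocess
import sys


def probe_version(command):
    """Run a command and return (exit_code, stdout, stderr) without raising.

    exit_code is None when the command could not be started at all,
    for example because it is not on PATH or is not executable.
    """
    try:
        done = subprocess.run(command, capture_output=True, text=True, check=False)
    except OSError as exc:
        return None, "", str(exc)
    return done.returncode, done.stdout.strip(), done.stderr.strip()


if __name__ == "__main__":
    code, out, err = probe_version(["spades.py", "--version"])
    print("interpreter :", sys.executable)
    print("exit code   :", code)
    print("stdout      :", out)
    print("stderr      :", err)
```

A non-zero exit code, or a version string that arrives on stderr instead of stdout, are the kinds of differences that an interactive run can hide.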

gbouras13 commented 8 months ago

Ok that is bizarre!

Maybe it is a python issue.

In that case, I'd try a fresh hybracter install with a Python version other than 3.12

e.g.

mamba install -n hybracterENV hybracter python=3.9

George

damientully commented 8 months ago

Thanks George. I installed with python 3.9 this time and got the following:

Finished job 11.
9 of 29 steps (31%) done

[Fri Mar 22 10:40:34 2024]
checkpoint check_completeness:
    input: hybracter_out/processing/assemblies/Sample2/assembly.fasta, hybracter_out/processing/assemblies/Sample2/assembly_info.txt
    output: hybracter_out/completeness/Sample2.txt
    jobid: 16
    reason: Missing output files: hybracter_out/completeness/Sample2.txt; Input files updated by another job: hybracter_out/processing/assemblies/Sample2/assembly_info.txt, hybracter_out/processing/assemblies/Sample2/assembly.fasta
    wildcards: sample=Sample2
    resources: tmpdir=/var/folders/v6/3dzc71_x309c1sxz_sghh9_h0000gp/T, mem_mb=4000, mem_mib=3815, mem=4000MB, time=00:00:05
DAG of jobs will be updated after completion.

[Fri Mar 22 10:40:34 2024]
rule assemble:
    input: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
    output: hybracter_out/processing/assemblies/Sample1/assembly.fasta, hybracter_out/processing/assemblies/Sample1/assembly_info.txt, hybracter_out/versions/Sample1/flye.version, hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt
    log: hybracter_out/stderr/assemble/Sample1.log
    jobid: 10
    benchmark: hybracter_out/benchmarks/assemble/Sample1.txt
    reason: Missing output files: hybracter_out/processing/assemblies/Sample1/assembly.fasta, hybracter_out/processing/assemblies/Sample1/assembly_info.txt, hybracter_out/versions/Sample1/flye.version, hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt; Input files updated by another job: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
    wildcards: sample=Sample1
    threads: 16
    resources: tmpdir=/var/folders/v6/3dzc71_x309c1sxz_sghh9_h0000gp/T, mem_mb=32000, mem_mib=30518, mem=32000MB, time=08:00:00

        flye --nano-hq hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -t 16  --out-dir hybracter_out/processing/assemblies/Sample1 2> hybracter_out/stderr/assemble/Sample1.log
        flye --version > hybracter_out/versions/Sample1/flye.version
        cp hybracter_out/processing/assemblies/Sample1/assembly_info.txt hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt

Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/a0d274bbd7d5a9b16721a797711e2ca1_
python -c "from __future__ import print_function; import sys, json; print(json.dumps([sys.version_info.major, sys.version_info.minor]))"
Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_
Environment defines Python version < 3.7. Using Python of the main process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.7 only.
/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/bin/python3.9 /Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py
Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_
Traceback (most recent call last):
  File "/Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py", line 7, in <module>
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
[Fri Mar 22 10:40:48 2024]
Error in rule check_completeness:
    jobid: 16
    input: hybracter_out/processing/assemblies/Sample2/assembly.fasta, hybracter_out/processing/assemblies/Sample2/assembly_info.txt
    output: hybracter_out/completeness/Sample2.txt
    conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_

RuleException:
CalledProcessError in file /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/rules/processing/extract_fastas.smk, line 21:
Command 'source /Users/eidedtul/miniforge3/envs/base_osx-64/bin/activate '/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_'; set -euo pipefail;  /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/bin/python3.9 /Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py' returned non-zero exit status 1.
  File "/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/rules/processing/extract_fastas.smk", line 21, in __rule_check_completeness
  File "/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/concurrent/futures/thread.py", line 52, in run
[Fri Mar 22 10:41:27 2024]
Finished job 10.
10 of 29 steps (34%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
cat .snakemake/log/2024-03-22T103535.073448.snakemake.log >> hybracter_out/hybracter.log
Complete log: .snakemake/log/2024-03-22T103535.073448.snakemake.log
[2024:03:22 10:41:28] ERROR: Snakemake failed
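
The log above states that the rule's environment defines Python < 3.7, so Snakemake ran check_completeness.py with the Python of the main process (hybracterENV/bin/python3.9), and the pandas import failed in that interpreter. A minimal sketch for checking whether a given interpreter can see a module without importing it is shown below. The helper name missing_modules is hypothetical, and the script would need to be run with the same interpreter path that appears in the log.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    missing = missing_modules(["pandas"])
    print("missing:", ", ".join(missing) if missing else "none")
```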

gbouras13 commented 8 months ago

Hi @damientully ,

It looks like you are running into some strange conda/mamba issues. It is odd that all of these environments fail in this way.

I'm not really sure how to help you solve this, short of re-installing conda (or trying a Linux machine).

I don't know whether you have a few or a lot of samples to assemble, but if you only have a few, one option may be a Colab notebook. If you are interested, let me know and I'll see what I can do.

George