Open damientully opened 8 months ago
Hi @damientully ,
Looks like the error is probably with plassembler_long (the rule above), which cascades down to Medaka. Based on the log, the environment doesn't have SPAdes.
Please try:
conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
mamba install bioconda::spades
conda deactivate
If you still get a Medaka error after that, please try
conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/3a238a896824eb2007e785c8a56e5932_
mamba uninstall medaka
mamba install medaka
conda deactivate
George
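After installing, it can help to confirm that the tools plassembler checks for are actually visible on PATH inside the activated environment. Below is a minimal sketch, not anything shipped with hybracter or plassembler: check_tools is a hypothetical helper, and the tool names are simply the ones that appear in the plassembler dependency log.

```shell
#!/usr/bin/env bash
# Report whether each named executable resolves on the current PATH.
# check_tools is a hypothetical helper, not part of hybracter or plassembler.
check_tools() {
    local tool path missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool"); then
            echo "found: $tool -> $path"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Run this after `conda activate <env path>` so PATH reflects that environment.
check_tools flye raven unicycler spades.py || echo "at least one tool is not on PATH"
```

The printed path also shows which copy of each tool is being picked up, which matters if more than one conda environment is on PATH.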
Thanks @gbouras13
Funny, I have tried this:
conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
mamba install bioconda::spades
conda deactivate
but I still get an error with spades:
Error in rule plassembler_long:
jobid: 43
input: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
output: hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta, hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv, hybracter_out/versions/Sample1/plassembler.version
log: hybracter_out/stderr/plassembler_long/Sample1.log (check log file(s) for error details)
conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
shell:
if unicycler --version ; then
plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
else
pip install git+https://github.com/rrwick/Unicycler.git
plassembler long -l hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -o hybracter_out/processing/plassembler/Sample1 -d /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test -t 16 -c 50000 --skip_qc --flye_directory hybracter_out/processing/assemblies/Sample1 --depth_filter 0.25 -f 2> hybracter_out/stderr/plassembler_long/Sample1.log
fi
touch hybracter_out/processing/plassembler/Sample1/plassembler_plasmids.fasta
touch hybracter_out/processing/plassembler/Sample1/plassembler_summary.tsv
plassembler --version > hybracter_out/versions/Sample1/plassembler.version
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile hybracter_out/stderr/plassembler_long/Sample1.log:
================================================================================
2024-03-21 11:08:40.353 | INFO | plassembler:begin_plassembler:100 - You are using Plassembler version 1.6.2
2024-03-21 11:08:40.353 | INFO | plassembler:begin_plassembler:101 - Repository homepage is https://github.com/gbouras13/plassembler
2024-03-21 11:08:40.353 | INFO | plassembler:begin_plassembler:102 - Written by George Bouras: george.bouras@adelaide.edu.au
2024-03-21 11:08:40.354 | INFO | plassembler:long:1294 - Database directory is /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../test_data/Plassembler_DB_Test
2024-03-21 11:08:40.354 | INFO | plassembler:long:1295 - Longreads file is hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
2024-03-21 11:08:40.354 | INFO | plassembler:long:1296 - Chromosome length threshold is 50000
2024-03-21 11:08:40.354 | INFO | plassembler:long:1297 - Output directory is hybracter_out/processing/plassembler/Sample1
2024-03-21 11:08:40.354 | INFO | plassembler:long:1298 - Min long read length is 500
2024-03-21 11:08:40.354 | INFO | plassembler:long:1299 - Min long read quality is 9
2024-03-21 11:08:40.354 | INFO | plassembler:long:1300 - Thread count is 16
2024-03-21 11:08:40.355 | INFO | plassembler:long:1301 - --force is True
2024-03-21 11:08:40.355 | INFO | plassembler:long:1302 - --skip_qc is True
2024-03-21 11:08:40.355 | INFO | plassembler:long:1303 - --raw_flag is False
2024-03-21 11:08:40.355 | INFO | plassembler:long:1304 - --pacbio_model is nothing
2024-03-21 11:08:40.355 | INFO | plassembler:long:1305 - --keep_chromosome is False
2024-03-21 11:08:40.355 | INFO | plassembler:long:1306 - --flye_directory is hybracter_out/processing/assemblies/Sample1
2024-03-21 11:08:40.355 | INFO | plassembler:long:1307 - --flye_assembly is nothing
2024-03-21 11:08:40.355 | INFO | plassembler:long:1308 - --flye_info is nothing
2024-03-21 11:08:40.356 | INFO | plassembler:long:1309 - --corrected_error_rate is 0.12
2024-03-21 11:08:40.356 | INFO | plassembler:long:1310 - --no_chromosome is False
2024-03-21 11:08:40.356 | INFO | plassembler:long:1311 - --depth_filter is 0.25
2024-03-21 11:08:40.356 | INFO | plassembler:long:1312 - --unicycler_options is None
2024-03-21 11:08:40.356 | INFO | plassembler:long:1313 - --spades_options is None
2024-03-21 11:08:40.356 | INFO | plassembler:long:1317 - Checking dependencies
2024-03-21 11:08:40.818 | INFO | plassembler.utils.input_commands:check_dependencies:199 - Flye version found is v2.9.3-b1797.
2024-03-21 11:08:40.819 | INFO | plassembler.utils.input_commands:check_dependencies:209 - Flye version is ok.
2024-03-21 11:08:40.869 | INFO | plassembler.utils.input_commands:check_dependencies:218 - Raven v1.8.3 found.
2024-03-21 11:08:40.870 | INFO | plassembler.utils.input_commands:check_dependencies:220 - Raven version is ok.
2024-03-21 11:08:41.144 | INFO | plassembler.utils.input_commands:check_dependencies:242 - Unicycler version found is v0.5.0.
2024-03-21 11:08:41.145 | INFO | plassembler.utils.input_commands:check_dependencies:255 - Unicycler version is ok.
2024-03-21 11:08:41.995 | ERROR | plassembler.utils.input_commands:check_dependencies:267 - SPAdes not found.
================================================================================
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-03-21T110830.798520.snakemake.log
WorkflowError:
At least one job did not complete successfully.
Maybe try
conda activate /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/5039853012fe4edf845348cd60513b78_
spades.py --help
and see what the error is?
George
Thanks George. That seems to work fine but plassembler doesn't seem to be picking it up:
2024-03-21 11:32:31.949 | ERROR | plassembler.utils.input_commands:check_dependencies:267 - SPAdes not found.
$ spades.py --help
SPAdes genome assembler v3.15.2
Usage: spades.py [options] -o <output_dir>
Basic options:
-o <output_dir> directory to store all the resulting files (required)
--isolate this flag is highly recommended for high-coverage isolate and multi-cell data
--sc this flag is required for MDA (single-cell) data
--meta this flag is required for metagenomic data
--bio this flag is required for biosyntheticSPAdes mode
--corona this flag is required for coronaSPAdes mode
--rna this flag is required for RNA-Seq data
--plasmid runs plasmidSPAdes pipeline for plasmid detection
--metaviral runs metaviralSPAdes pipeline for virus detection
--metaplasmid runs metaplasmidSPAdes pipeline for plasmid detection in metagenomic datasets (equivalent for --meta --plasmid)
--rnaviral this flag enables virus assembly module from RNA-Seq data
--iontorrent this flag is required for IonTorrent data
--test runs SPAdes on toy dataset
-h, --help prints this usage message
-v, --version prints version
Input data:
--12 <filename> file with interlaced forward and reverse paired-end reads
-1 <filename> file with forward paired-end reads
-2 <filename> file with reverse paired-end reads
-s <filename> file with unpaired reads
--merged <filename> file with merged forward and reverse paired-end reads
--pe-12 <#> <filename> file with interlaced reads for paired-end library number <#>.
Older deprecated syntax is -pe<#>-12 <filename>
--pe-1 <#> <filename> file with forward reads for paired-end library number <#>.
Older deprecated syntax is -pe<#>-1 <filename>
--pe-2 <#> <filename> file with reverse reads for paired-end library number <#>.
Older deprecated syntax is -pe<#>-2 <filename>
--pe-s <#> <filename> file with unpaired reads for paired-end library number <#>.
Older deprecated syntax is -pe<#>-s <filename>
--pe-m <#> <filename> file with merged reads for paired-end library number <#>.
Older deprecated syntax is -pe<#>-m <filename>
--pe-or <#> <or> orientation of reads for paired-end library number <#>
(<or> = fr, rf, ff).
Older deprecated syntax is -pe<#>-<or>
--s <#> <filename> file with unpaired reads for single reads library number <#>.
Older deprecated syntax is --s<#> <filename>
--mp-12 <#> <filename> file with interlaced reads for mate-pair library number <#>.
Older deprecated syntax is -mp<#>-12 <filename>
--mp-1 <#> <filename> file with forward reads for mate-pair library number <#>.
Older deprecated syntax is -mp<#>-1 <filename>
--mp-2 <#> <filename> file with reverse reads for mate-pair library number <#>.
Older deprecated syntax is -mp<#>-2 <filename>
--mp-s <#> <filename> file with unpaired reads for mate-pair library number <#>.
Older deprecated syntax is -mp<#>-s <filename>
--mp-or <#> <or> orientation of reads for mate-pair library number <#>
(<or> = fr, rf, ff).
Older deprecated syntax is -mp<#>-<or>
--hqmp-12 <#> <filename> file with interlaced reads for high-quality mate-pair library number <#>.
Older deprecated syntax is -hqmp<#>-12 <filename>
--hqmp-1 <#> <filename> file with forward reads for high-quality mate-pair library number <#>.
Older deprecated syntax is -hqmp<#>-1 <filename>
--hqmp-2 <#> <filename> file with reverse reads for high-quality mate-pair library number <#>.
Older deprecated syntax is -hqmp<#>-2 <filename>
--hqmp-s <#> <filename> file with unpaired reads for high-quality mate-pair library number <#>.
Older deprecated syntax is -hqmp<#>-s <filename>
--hqmp-or <#> <or> orientation of reads for high-quality mate-pair library number <#>
(<or> = fr, rf, ff).
Older deprecated syntax is -hqmp<#>-<or>
--sanger <filename> file with Sanger reads
--pacbio <filename> file with PacBio reads
--nanopore <filename> file with Nanopore reads
--trusted-contigs <filename>
file with trusted contigs
--untrusted-contigs <filename>
file with untrusted contigs
Pipeline options:
--only-error-correction runs only read error correction (without assembling)
--only-assembler runs only assembling (without read error correction)
--careful tries to reduce number of mismatches and short indels
--checkpoints <last or all>
save intermediate check-points ('last', 'all')
--continue continue run from the last available check-point (only -o should be specified)
--restart-from <cp> restart run with updated options and from the specified check-point
('ec', 'as', 'k<int>', 'mc', 'last')
--disable-gzip-output forces error correction not to compress the corrected reads
--disable-rr disables repeat resolution stage of assembling
Advanced options:
--dataset <filename> file with dataset description in YAML format
-t <int>, --threads <int> number of threads. [default: 16]
-m <int>, --memory <int> RAM limit for SPAdes in Gb (terminates if exceeded). [default: 250]
--tmp-dir <dirname> directory for temporary files. [default: <output_dir>/tmp]
-k <int> [<int> ...] list of k-mer sizes (must be odd and less than 128)
[default: 'auto']
--cov-cutoff <float> coverage cutoff value (a positive float number, or 'auto', or 'off')
[default: 'off']
--phred-offset <33 or 64> PHRED quality offset in the input reads (33 or 64),
[default: auto-detect]
--custom-hmms <dirname> directory with custom hmms that replace default ones,
[default: None]
That is very strange - maybe try spades.py --version
$ spades.py --version
SPAdes genome assembler v3.15.2
Ok that is bizarre!
Maybe it is a python issue.
In that case, I'd try a fresh hybracter install with a version that isn't 3.12
e.g.
mamba install -n hybracterENV hybracter python=3.9
George
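One way to probe the "python issue" theory before reinstalling is to look at which interpreter the spades.py launcher would run under in the activated environment. This is only a sketch under assumptions: it assumes spades.py is a script that starts with a shebang line, and show_interpreter is a hypothetical helper written for illustration.

```shell
#!/usr/bin/env bash
# Print the shebang line of a tool found on PATH.
# show_interpreter is a hypothetical helper for illustration only.
show_interpreter() {
    local tool=$1 path first
    path=$(command -v "$tool") || { echo "$tool: not on PATH"; return 1; }
    # Read only the first line; strip NUL bytes in case the file is a binary.
    first=$(head -n 1 "$path" 2>/dev/null | tr -d '\0')
    case $first in
        '#!'*) echo "$tool: $path runs under ${first#??}" ;;
        *)     echo "$tool: $path has no shebang line" ;;
    esac
}

# Run inside the activated environment.
show_interpreter spades.py || true
if command -v python >/dev/null; then python --version; fi
```

If the launcher resolves its interpreter through PATH, the Python version printed on the last line is the one it would use in that environment.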
Thanks George. I installed with python 3.9 this time and got the following:
Finished job 11. 9 of 29 steps (31%) done
[Fri Mar 22 10:40:34 2024]
checkpoint check_completeness:
input: hybracter_out/processing/assemblies/Sample2/assembly.fasta, hybracter_out/processing/assemblies/Sample2/assembly_info.txt
output: hybracter_out/completeness/Sample2.txt
jobid: 16
reason: Missing output files: hybracter_out/completeness/Sample2.txt; Input files updated by another job: hybracter_out/processing/assemblies/Sample2/assembly_info.txt, hybracter_out/processing/assemblies/Sample2/assembly.fasta
wildcards: sample=Sample2
resources: tmpdir=/var/folders/v6/3dzc71_x309c1sxz_sghh9_h0000gp/T, mem_mb=4000, mem_mib=3815, mem=4000MB, time=00:00:05
DAG of jobs will be updated after completion.
[Fri Mar 22 10:40:34 2024]
rule assemble:
input: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
output: hybracter_out/processing/assemblies/Sample1/assembly.fasta, hybracter_out/processing/assemblies/Sample1/assembly_info.txt, hybracter_out/versions/Sample1/flye.version, hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt
log: hybracter_out/stderr/assemble/Sample1.log
jobid: 10
benchmark: hybracter_out/benchmarks/assemble/Sample1.txt
reason: Missing output files: hybracter_out/processing/assemblies/Sample1/assembly.fasta, hybracter_out/processing/assemblies/Sample1/assembly_info.txt, hybracter_out/versions/Sample1/flye.version, hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt; Input files updated by another job: hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz
wildcards: sample=Sample1
threads: 16
resources: tmpdir=/var/folders/v6/3dzc71_x309c1sxz_sghh9_h0000gp/T, mem_mb=32000, mem_mib=30518, mem=32000MB, time=08:00:00
flye --nano-hq hybracter_out/processing/qc/Sample1_filt_trim.fastq.gz -t 16 --out-dir hybracter_out/processing/assemblies/Sample1 2> hybracter_out/stderr/assemble/Sample1.log
flye --version > hybracter_out/versions/Sample1/flye.version
cp hybracter_out/processing/assemblies/Sample1/assembly_info.txt hybracter_out/supplementary_results/flye_individual_summaries/Sample1_assembly_info.txt
Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/a0d274bbd7d5a9b16721a797711e2ca1_
python -c "from __future__ import print_function; import sys, json; print(json.dumps([sys.version_info.major, sys.version_info.minor]))"
Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_
Environment defines Python version < 3.7. Using Python of the main process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.7 only.
/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/bin/python3.9 /Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py
Activating conda environment: miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_
Traceback (most recent call last):
File "/Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py", line 7, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
[Fri Mar 22 10:40:48 2024]
Error in rule check_completeness:
jobid: 16
input: hybracter_out/processing/assemblies/Sample2/assembly.fasta, hybracter_out/processing/assemblies/Sample2/assembly_info.txt
output: hybracter_out/completeness/Sample2.txt
conda-env: /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_
RuleException:
CalledProcessError in file /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/rules/processing/extract_fastas.smk, line 21:
Command 'source /Users/eidedtul/miniforge3/envs/base_osx-64/bin/activate '/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/conda/20e890568ebb146b6a975de6e851d22f_'; set -euo pipefail; /Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/bin/python3.9 /Users/eidedtul/.snakemake/scripts/tmpo9rcklf6.check_completeness.py' returned non-zero exit status 1.
File "/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/site-packages/hybracter/workflow/rules/processing/extract_fastas.smk", line 21, in __rule_check_completeness
File "/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/lib/python3.9/concurrent/futures/thread.py", line 52, in run
[Fri Mar 22 10:41:27 2024]
Finished job 10.
10 of 29 steps (34%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
cat .snakemake/log/2024-03-22T103535.073448.snakemake.log >> hybracter_out/hybracter.log
Complete log: .snakemake/log/2024-03-22T103535.073448.snakemake.log
[2024:03:22 10:41:28] ERROR: Snakemake failed
Hi @damientully ,
It looks like you are running into some strange conda/mamba issues - it is odd that all of these environments fail like this.
I'm not really sure how to help you solve this, short of re-installing conda (or trying a Linux machine).
I am unsure whether you have a few or a lot of samples to assemble, but if you only have a few, one option may be a Colab notebook - if you are interested, let me know and I'll see what I can do.
George
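The traceback above shows check_completeness.py being run with the main hybracterENV interpreter (python3.9) rather than with the rule's own environment, so a quick check is whether that interpreter can import pandas. The sketch below is illustrative only: has_module is a hypothetical helper, and the interpreter path is copied from the log.

```shell
#!/usr/bin/env bash
# Succeed if the given Python interpreter can import the given module.
# has_module is a hypothetical helper for illustration only.
has_module() {
    local py=$1 mod=$2
    "$py" -c "import $mod" >/dev/null 2>&1
}

# Interpreter path copied from the log above; adjust for your own install.
PY=/Users/eidedtul/miniforge3/envs/base_osx-64/envs/hybracterENV/bin/python3.9
if [ ! -x "$PY" ]; then
    echo "interpreter not found: $PY"
elif has_module "$PY" pandas; then
    echo "pandas is importable from $PY"
else
    echo "pandas is NOT importable from $PY"
fi
```

If pandas turns out not to be importable, installing it into hybracterENV would be the obvious thing to try, though that is a guess rather than something confirmed in this thread.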
Hi,
I am getting the following error when running hybracter test-long, which I presume is related to Medaka. Have you encountered this issue before, and do you have a solution?
Many thanks, Damien