gbouras13 / hybracter

Automated long-read first bacterial genome assembly tool implemented in Snakemake using Snaketool.
MIT License

Unicycler not found error #102

Open schmittel opened 1 month ago

schmittel commented 1 month ago

Hi, thanks for this software. I used it on a directory of Nanopore fastq files and it worked beautifully. But when running on a directory of PacBio fastq files I keep getting an error. This error is a variation on what has been reported before but so far I'm unable to overcome it.

Here's the error:

Activating conda environment: ../../../code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/308674b41518b3abe7fb1010240eff8c_
[Thu Oct 17 16:43:56 2024]
Error in rule plassembler_long:
    jobid: 171
    input: /output/processing/kmc/JM109/JM109_kmcLOG.txt, /output/processing/qc/JM109_filt_trim.fastq.gz
    output: /output/processing/plassembler/JM109/plassembler_plasmids.fasta, /output/processing/plassembler/JM109/plassembler_summary.tsv, /output/versions/JM109/plassembler.version
    log: /output/stderr/plassembler_long/JM109.log (check log file(s) for error details)
    conda-env: /code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/308674b41518b3abe7fb1010240eff8c_
    shell:

        plassembler long -l /output/processing/qc/JM109_filt_trim.fastq.gz -o /output/processing/plassembler/JM109 -d /code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../databases -t 16 -c 4000000 --skip_qc --flye_directory /output/processing/assemblies/JM109 --depth_filter 0.25 -f 2> /output/stderr/plassembler_long/JM109.log
        touch /output/processing/plassembler/JM109/plassembler_plasmids.fasta
        touch /output/processing/plassembler/JM109/plassembler_summary.tsv
        plassembler --version > /output/versions/JM109/plassembler.version

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile /output/stderr/plassembler_long/JM109.log:
================================================================================
2024-10-17 16:43:56.623 | INFO     | plassembler:begin_plassembler:100 - You are using Plassembler version 1.6.2
2024-10-17 16:43:56.623 | INFO     | plassembler:begin_plassembler:101 - Repository homepage is https://github.com/gbouras13/plassembler
2024-10-17 16:43:56.623 | INFO     | plassembler:begin_plassembler:102 - Written by George Bouras: george.bouras@adelaide.edu.au
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1294 - Database directory is /code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../databases
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1295 - Longreads file is /output/processing/qc/JM109_filt_trim.fastq.gz
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1296 - Chromosome length threshold is 4000000
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1297 - Output directory is /output/processing/plassembler/JM109
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1298 - Min long read length is 500
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1299 - Min long read quality is 9
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1300 - Thread count is 16
2024-10-17 16:43:56.623 | INFO     | plassembler:long:1301 - --force is True
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1302 - --skip_qc is True
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1303 - --raw_flag is False
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1304 - --pacbio_model is nothing
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1305 - --keep_chromosome is False
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1306 - --flye_directory is /output/processing/assemblies/JM109
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1307 - --flye_assembly is nothing
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1308 - --flye_info is nothing
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1309 - --corrected_error_rate is 0.12
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1310 - --no_chromosome is False
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1311 - --depth_filter is 0.25
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1312 - --unicycler_options is None
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1313 - --spades_options is None
2024-10-17 16:43:56.624 | INFO     | plassembler:long:1317 - Checking dependencies
2024-10-17 16:43:56.697 | INFO     | plassembler.utils.input_commands:check_dependencies:199 - Flye version found is v2.9.5-b1801.
2024-10-17 16:43:56.698 | INFO     | plassembler.utils.input_commands:check_dependencies:209 - Flye version is ok.
2024-10-17 16:43:56.705 | INFO     | plassembler.utils.input_commands:check_dependencies:218 - Raven v1.8.3 found.
2024-10-17 16:43:56.705 | INFO     | plassembler.utils.input_commands:check_dependencies:220 - Raven version is ok.
2024-10-17 16:43:56.772 | ERROR    | plassembler.utils.input_commands:check_dependencies:239 - Unicycler not found. Please re-install Unicycler, see instructions at https://github.com/gbouras13/plassembler.
================================================================================

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-17T162626.887003.snakemake.log
WorkflowError:
At least one job did not complete successfully.
[2024:10:17 16:43:56] ERROR: Snakemake failed

However, when I activate the /code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/308674b41518b3abe7fb1010240eff8c_ conda environment, Unicycler is most definitely installed. Curiously, when I directly run the command from the terminal:

plassembler long -l /output/processing/qc/JM109_filt_trim.fastq.gz -o /output/processing/plassembler/JM109 -d /code/conda_envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../databases -t 16 -c 4000000 --skip_qc --flye_directory /output/processing/assemblies/JM109 --depth_filter 0.25 -f 2> /output/stderr/plassembler_long/JM109.log
        touch /output/processing/plassembler/JM109/plassembler_plasmids.fasta
        touch /output/processing/plassembler/JM109/plassembler_summary.tsv
        plassembler --version > /output/versions/JM109/plassembler.version

it appears to complete without error. I tried editing the plassembler.yaml file as suggested here, but this did not solve the problem. I installed everything according to the instructions and, as I said, everything worked beautifully the first time. I'm just not sure how to continue from here, and any help would be appreciated.
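For anyone comparing the two cases (the tool is visible in an interactive shell but not inside the Snakemake job), a minimal sketch of a PATH check follows. `check_tools` is a hypothetical helper written for this thread, not part of hybracter or plassembler, and it only tests whether each executable resolves on the current PATH. It is an assumption that this matches what plassembler's own dependency check looks for, since that check may also inspect versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each named executable resolves on the
# current PATH. Returns non-zero if any of them is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found at $(command -v "$tool")"
        else
            echo "$tool: NOT FOUND on PATH"
            missing=1
        fi
    done
    return "$missing"
}

# Dependencies named in the log above; run this after activating the
# conda environment that the failing rule reports.
check_tools flye raven unicycler || echo "At least one dependency is missing."
```

Running this once in the manually activated environment and once from the shell that launches hybracter would show whether the two see the same executables.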

Many thanks!

gbouras13 commented 1 month ago

This is ultra strange @schmittel - you have done everything I would suggest to check. Therefore, I am not really sure why hybracter would fail (especially given you had it working already!) - swapping to pacbio shouldn't affect unicycler's installation.

My question is: what version of hybracter is this? Did you try v0.9.1? I have updated the plassembler yaml, as unicycler's conda installation is much more streamlined now. So maybe try this out (or perhaps it is the cause of the issue!).

Another option, if everything still breaks, is to use the prebuilt container: see the notes in the README.md. I usually run Hybracter with the container myself these days, to be honest.

George