nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

Error running medaka_consensus #445

Closed by RayanaFeltrin 11 months ago

RayanaFeltrin commented 11 months ago

Hello!

I installed medaka 1.8.1 and its dependencies in a virtual environment on Ubuntu 20.04.4 LTS with no GPU. I then ran the script below, following the template suggested in the README:

#!/bin/bash

source medaka/bin/activate
NPROC=$(nproc)
BASECALLS=/home/rayana/Documents/Genomes_Nanopore/6set2022_4genomas/20220906_1853_MN19475_FAK88705_0a2345f8/fast5_guppy/pass/barcode11/*fastq
DRAFT=/home/rayana/Documents/Genomes_ONT_only/AM1001_ONT_contigs.fasta
OUTDIR=medaka_consensus

medaka_consensus -i ${BASECALLS} -d ${DRAFT} -o ${OUTDIR} -t ${NPROC} -m r104_min_sup_g642

I got the following message:

medaka 1.8.1

Assembly polishing via neural networks. Medaka is optimized to work with the Flye assembler.

medaka_consensus [-h] -i -d

-h  show this help text.
-i  fastx input basecalls (required).
-d  fasta input assembly (required).
-o  output folder (default: medaka).
-g  don't fill gaps in consensus with draft sequence.
-r  use gap-filling character instead of draft sequence (default: None)
-m  medaka model, (default: r1041_e82_400bps_sup_v4.2.0).
    Choices: r103_fast_g507 r103_hac_g507 r103_min_high_g345 r103_min_high_g360 r103_prom_high_g360 r103_sup_g507 r1041_e82_260bps_fast_g632 r1041_e82_260bps_hac_g632 r1041_e82_260bps_hac_v4.0.0 r1041_e82_260bps_hac_v4.1.0 r1041_e82_260bps_sup_g632 r1041_e82_260bps_sup_v4.0.0 r1041_e82_260bps_sup_v4.1.0 r1041_e82_400bps_fast_g615 r1041_e82_400bps_fast_g632 r1041_e82_400bps_hac_g615 r1041_e82_400bps_hac_g632 r1041_e82_400bps_hac_v4.0.0 r1041_e82_400bps_hac_v4.1.0 r1041_e82_400bps_hac_v4.2.0 r1041_e82_400bps_sup_g615 r1041_e82_400bps_sup_v4.0.0 r1041_e82_400bps_sup_v4.1.0 r1041_e82_400bps_sup_v4.2.0 r104_e81_fast_g5015 r104_e81_hac_g5015 r104_e81_sup_g5015 r104_e81_sup_g610 r10_min_high_g303 r10_min_high_g340 r941_e81_fast_g514 r941_e81_hac_g514 r941_e81_sup_g514 r941_min_fast_g303 r941_min_fast_g507 r941_min_hac_g507 r941_min_high_g303 r941_min_high_g330 r941_min_high_g340_rle r941_min_high_g344 r941_min_high_g351 r941_min_high_g360 r941_min_sup_g507 r941_prom_fast_g303 r941_prom_fast_g507 r941_prom_hac_g507 r941_prom_high_g303 r941_prom_high_g330 r941_prom_high_g344 r941_prom_high_g360 r941_prom_high_g4011 r941_prom_sup_g507 r941_sup_plant_g610
    Alternatively a .tar.gz/.hdf file from 'medaka train'.
-f  Force overwrite of outputs (default will reuse existing outputs).
-x  Force recreation of alignment index.
-t  number of threads with which to create features (default: 1).
-b  batchsize, controls memory use (default: 100).
-q  Output consensus with per-base quality scores (fastq).

-d must be specified.

After that, I double-checked the paths of BASECALLS and DRAFT, and everything seems correct. However, I realized that the model I chose (r104_min_sup_g642 => flowcell 10.4, MinION, Guppy 6.4.2 run in sup mode) is not in the list of available models. I then picked another model at random to check whether this was the problem, but I got the same error message. Is there anything I can do to solve this problem? Which model do you suggest I use?
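
The same message appearing with a different model suggests that the model is not the cause. A likely explanation, assuming medaka_consensus parses its options with bash getopts (the script itself is not shown here), is that the unquoted ${BASECALLS} glob expands to several fastq paths: the first becomes the value of -i, and the second, being a non-option argument, ends option parsing before -d is read. The sketch below reproduces that behaviour with a hypothetical parse_args function; it is a stand-in for illustration, not medaka's code.

```shell
#!/bin/bash
# parse_args: hypothetical stand-in for a getopts loop that accepts -i and -d.
# It is NOT medaka's implementation, only a demonstration of getopts behaviour.
parse_args() {
    local OPTIND=1 opt input="" draft=""
    while getopts ':i:d:' opt; do
        case ${opt} in
            i) input=${OPTARG} ;;
            d) draft=${OPTARG} ;;
            *) ;;
        esac
    done
    # getopts stops at the first non-option argument, so draft may still be empty.
    if [[ -z ${draft} ]]; then
        echo "-d must be specified."
        return 1
    fi
    echo "input=${input} draft=${draft}"
}

# One input file: both options are parsed.
parse_args -i a.fastq -d draft.fasta
# prints: input=a.fastq draft=draft.fasta

# Two input files (what an expanded glob looks like): parsing ends at b.fastq.
parse_args -i a.fastq b.fastq -d draft.fasta || true
# prints: -d must be specified.
```

If this is what happens, the paths themselves can be correct and the error still appears whenever the glob matches more than one file.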

Thank you in advance.

RayanaFeltrin commented 11 months ago

Hello!

I have solved part of the issue by running the command from the script directly in the terminal, like this:

medaka_consensus -i /home/rayana/Documents/Genomes_Nanopore/6set2022_4genomas/20220906_1853_MN19475_FAK88705_0a2345f8/fast5_guppy/pass/barcode11/fastq_runid_353c5d5a3a4274b1ccc980f2669c44b1911efc64_15_0.fastq -d /home/rayana/Documents/Genomes_ONT_only/AM1001_ONT_contigs.fasta -o medaka_consensus -t $(nproc) -m r941_min_high_g303

(Note that I passed only one fastq file and also changed the model to default to make it work.)
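
Since the help text describes -i as a single fastx input, one way to polish with all reads from the barcode is to concatenate the fastq files into one file first. A minimal sketch, assuming the files are uncompressed and sit in one directory; combine_fastq, the example paths, and the MODEL placeholder are hypothetical and should be adapted:

```shell
#!/bin/bash
# combine_fastq DIR OUT: concatenate every DIR/*.fastq into OUT (hypothetical helper).
# Write OUT outside DIR, or with a different extension, so a later run does not
# pick up the merged file as an input.
combine_fastq() {
    local dir=$1 out=$2
    local files=("${dir}"/*.fastq)
    # Without nullglob an unmatched pattern stays literal, so test the first entry.
    if [[ ! -e ${files[0]} ]]; then
        echo "no .fastq files found in ${dir}" >&2
        return 1
    fi
    cat "${files[@]}" > "${out}"
}

# Example usage (placeholders; uncomment and adjust before running):
# combine_fastq /path/to/pass/barcode11 barcode11_all.fastq
# medaka_consensus -i barcode11_all.fastq -d "${DRAFT}" -o "${OUTDIR}" -t "$(nproc)" -m "${MODEL}"
```

Quoting the variables in the final command keeps each option value as a single argument, so -d is still seen even if a path contains spaces.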

However, my question remains: which model do you suggest for flowcell 10.4, MinION, and Guppy 6.4.2 run in sup mode?

Thank you in advance.

cjw85 commented 11 months ago

You should use r1041_e82_400bps_sup_g615:

https://github.com/nanoporetech/medaka/issues/442#issuecomment-1606999964

From the README.