nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

-d must be specified error? #434

Closed lidi4 closed 10 months ago

lidi4 commented 1 year ago

Hello, I am relatively new to this. When I try to run medaka, it exits immediately, printing the command options followed by the line "-d must be specified", even though -d is specified in the command. I have validated the draft's file location and the command itself, and I cannot figure out what more I could be supplying, since I am providing the path to the draft assembly for -d. The file is in the current working directory when I run the command, but I have also tried providing the full path and running from a different directory. No matter what variation I try, it won't run and I get the same error. I've checked the draft assembly file and its location countless times, I swear!

I am running medaka 1.7.2 in its own fresh, dedicated environment (created with mamba because I couldn't get conda to solve the environment after many efforts and adjustments). I know the reads and assembly files are not broken because I can run quality-checking programs on them and get results. The assembly is Flye output. The model is also correct. I am currently running this on just one read file, which comes from a metagenomic sample.

Here are the basics of the command:

medaka_consensus -i ~/Test_Data/reads.fastq –d assembly.fasta -o ~/Test_Outputs/medaka –m R941_min_high_g303

Logging medaka 1.7.2

Assembly polishing via neural networks. Medaka is optimized to work with the Flye assembler.

medaka_consensus [-h] -i <fastx> -d <fasta>

-h  show this help text.
-i  fastx input basecalls (required).
-d  fasta input assembly (required).
-o  output folder (default: medaka).
-g  don't fill gaps in consensus with draft sequence.
-r  use gap-filling character instead of draft sequence (default: None)
-m  medaka model, (default: r941_min_hac_g507).
    Choices: r103_fast_g507 r103_hac_g507 r103_min_high_g345 r103_min_high_g360 r103_prom_high_g360 r103_sup_g507 r1041_e82_260bps_fast_g632 r1041_e82_260bps_hac_g632 r1041_e82_260bps_sup_g632 r1041_e82_400bps_fast_g615 r1041_e82_400bps_fast_g632 r1041_e82_400bps_hac_g615 r1041_e82_400bps_hac_g632 r1041_e82_400bps_sup_g615 r104_e81_fast_g5015 r104_e81_hac_g5015 r104_e81_sup_g5015 r104_e81_sup_g610 r10_min_high_g303 r10_min_high_g340 r941_e81_fast_g514 r941_e81_hac_g514 r941_e81_sup_g514 r941_min_fast_g303 r941_min_fast_g507 r941_min_hac_g507 r941_min_high_g303 r941_min_high_g330 r941_min_high_g340_rle r941_min_high_g344 r941_min_high_g351 r941_min_high_g360 r941_min_sup_g507 r941_prom_fast_g303 r941_prom_fast_g507 r941_prom_hac_g507 r941_prom_high_g303 r941_prom_high_g330 r941_prom_high_g344 r941_prom_high_g360 r941_prom_high_g4011 r941_prom_sup_g507 r941_sup_plant_g610
    Alternatively a .tar.gz/.hdf file from 'medaka train'.
-f  Force overwrite of outputs (default will reuse existing outputs).
-x  Force recreation of alignment index.
-t  number of threads with which to create features (default: 1).
-b  batchsize, controls memory use (default: 100).

-d must be specified.

Mamba environment & install:

mamba create -n medaka medaka -c conda-forge -c bioconda

Let me know if anything else is needed. This is just being run on a local laptop, no cluster or anything.
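
One thing that may be worth checking in the command as pasted above: the characters in front of "d" and "m" appear to be en dashes (U+2013) rather than ASCII hyphens, while -i and -o use plain hyphens. A shell option parser would not treat an en dash followed by "d" as the -d flag, which would be consistent with the "-d must be specified" message. The en dashes may also simply be an artifact of pasting the command into the issue. Below is a minimal Python sketch for spotting such characters; the function name find_dash_lookalikes is hypothetical and not part of medaka.

```python
import unicodedata

# Unicode characters that look like "-" but are not the ASCII hyphen-minus.
DASH_LOOKALIKES = {
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2212",  # minus sign
}


def find_dash_lookalikes(command):
    """Return (token, character name) pairs for tokens that start with a dash look-alike."""
    hits = []
    for token in command.split():
        if token[0] in DASH_LOOKALIKES:
            hits.append((token, unicodedata.name(token[0])))
    return hits


# The command as it appears in the post, with the en dashes written as escapes.
cmd = (
    "medaka_consensus -i ~/Test_Data/reads.fastq \u2013d assembly.fasta "
    "-o ~/Test_Outputs/medaka \u2013m R941_min_high_g303"
)

for token, name in find_dash_lookalikes(cmd):
    print(f"{token!r} starts with {name}, not an ASCII hyphen")
```

If the check reports anything, retyping the flags by hand in the terminal (rather than pasting from a word processor or web page) is a simple way to rule this out.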

raquelgarza commented 1 year ago

I had the same issue. Pretty sure it will work if you set your command as:

medaka_consensus -i~/Test_Data/reads.fastq –dassembly.fasta -o~/Test_Outputs/medaka –mR941_min_high_g303

That is, with no spaces between the flags and their paths or parameter values. This worked for me on a Linux system with medaka 1.6.0.

I wonder if the ~ will be a problem; I usually avoid it. If you still run into trouble, maybe try using absolute paths for the parameters.
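
The advice about avoiding ~ and using absolute paths can be sketched as follows. Bash generally expands ~ only at the start of a word, so a form like -i~/Test_Data/reads.fastq would likely reach the program with the tilde unexpanded. The sketch builds an argument list with expanded absolute paths and ASCII hyphens. The flags are the ones listed in the help text above, the helper name build_medaka_args is hypothetical, and the lowercase model name is taken from the Choices list on the assumption that model names are matched case-sensitively. It only constructs the command and does not run medaka.

```python
import os
import shlex


def build_medaka_args(reads, draft, outdir, model):
    """Build a medaka_consensus argument list with absolute, tilde-expanded paths."""

    def absolute(path):
        # expanduser turns a leading "~" into the home directory;
        # abspath resolves relative paths against the current directory.
        return os.path.abspath(os.path.expanduser(path))

    return [
        "medaka_consensus",
        "-i", absolute(reads),
        "-d", absolute(draft),
        "-o", absolute(outdir),
        "-m", model,
    ]


args = build_medaka_args(
    "~/Test_Data/reads.fastq",
    "assembly.fasta",
    "~/Test_Outputs/medaka",
    "r941_min_high_g303",
)
# Print a copy-pasteable command line; the list could also be handed to subprocess.run.
print(shlex.join(args))
```

Building the command as a list keeps each flag and its value as separate arguments, so there is no ambiguity about spacing or about which dash character was typed.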