nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

Failed to run medaka consensus #532

Status: Closed (zmqstc closed 1 month ago)

zmqstc commented 1 month ago

Hello, I tried running medaka_consensus and got the error shown below. The command I used to run medaka was:

medaka_consensus -i cluster_001/4_reads.fastq -d cluster_001/7_final_consensus.fasta -o cluster_001/medaka -m /hl/zhoumengqing/software/r1041_e82_400bps_sup_v5.0.0_model.tar.gz -t 24

I got the following error:

Checking program versions
This is medaka 2.0.0
Program    Version    Required   Pass
bcftools   1.21       1.11       True
bgzip      1.21       1.11       True
minimap2   2.28       2.11       True
samtools   1.21       1.11       True
tabix      1.21       1.11       True
[22:02:15 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmphtjwgu0b.
[22:02:17 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmpmiofmt84.
Aligning basecalls to draft
Creating fai index file /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/7_final_consensus.fasta.fai
Creating mmi index file /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/7_final_consensus.fasta.map-ont.mmi
[M::mm_idx_gen::0.066*0.99] collected minimizers
[M::mm_idx_gen::0.083*1.37] sorted minimizers
[M::main::0.099*1.31] loaded/built the index for 1 target sequence(s)
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.103*1.30] distinct minimizers: 334375 (97.38% are singletons); average occurrences: 1.070; average spacing: 5.342; total length: 1912037
[M::main] Version: 2.28-r1209
[M::main] CMD: minimap2 -I 16G -x map-ont -d /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/7_final_consensus.fasta.map-ont.mmi /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/7_final_consensus.fasta
[M::main] Real time: 0.123 sec; CPU: 0.138 sec; Peak RSS: 0.023 GB
[M::main::0.030*0.87] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.035*0.89] mid_occ = 18
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.038*0.90] distinct minimizers: 334375 (97.38% are singletons); average occurrences: 1.070; average spacing: 5.342; total length: 1912037
[M::worker_pipeline::315.169*0.89] mapped 34123 sequences
[M::worker_pipeline::635.878*0.88] mapped 33362 sequences
[M::worker_pipeline::681.101*0.87] mapped 4046 sequences
[M::main] Version: 2.28-r1209
[M::main] CMD: minimap2 -x map-ont --secondary=no -L --MD -A 2 -B 4 -O 4,24 -E 2,1 -t 1 -a /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/7_final_consensus.fasta.map-ont.mmi /hl/zhoumengqing/ONT-MGI-2024/1_LAC/Second_2/results/3.2_Trycycler_0.5.4/H11_trycycler/cluster_001/4_reads.fastq
[M::main] Real time: 681.139 sec; CPU: 589.853 sec; Peak RSS: 1.207 GB
[bam_sort_core] merging from 2 files and 1 in-memory blocks...
Running medaka consensus
[22:14:41 - Predict] Processing region(s): cluster_001_consensus:0-1912037
[22:14:41 - Predict] Using model: /hl/zhoumengqing/software/r1041_e82_400bps_sup_v5.0.0_model.tar.gz.
[22:14:42 - Predict] Using minimum mapQ threshold of 1 for read filtering.
[22:14:42 - MdlStrTGZ] Successfully removed temporary files from /tmp/tmpxjt4fbri.
[22:14:42 - MdlStrTGZ] ModelStoreTGZ exception <class 'ValueError'>
Traceback (most recent call last):
  File "/hl/zhoumengqing/miniconda3/envs/medaka/bin/medaka", line 11, in <module>
    sys.exit(main())
  File "/hl/zhoumengqing/miniconda3/envs/medaka/lib/python3.8/site-packages/medaka/medaka.py", line 835, in main
    args.func(args)
  File "/hl/zhoumengqing/miniconda3/envs/medaka/lib/python3.8/site-packages/medaka/prediction.py", line 150, in predict
    model = model_store.load_model(device=device)
  File "/hl/zhoumengqing/miniconda3/envs/medaka/lib/python3.8/site-packages/medaka/datastore.py", line 185, in load_model
    self.model = model_partial_function(time_steps=time_steps)
  File "/hl/zhoumengqing/miniconda3/envs/medaka/lib/python3.8/site-packages/medaka/models.py", line 348, in build_model
    raise ValueError("Model format is not supported by medaka v2.x.")
ValueError: Model format is not supported by medaka v2.x.
Failed to run medaka consensus.

Any help would be appreciated.

ealdraed commented 1 month ago

Hello @zmqstc,

You explicitly specified a TensorFlow model in your command. Medaka 2.x uses PyTorch, so either point to the PyTorch version of the model file, "-m /path/to/r1041_e82_400bps_sup_v5.0.0_model_pt.tar.gz" (note the "_pt" suffix), or simply pass the model name, "-m r1041_e82_400bps_sup_v5.0.0".

I hope this resolves the issue.
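
Since the failure only surfaced after roughly eleven minutes of alignment, a quick check of the `-m` value before launching can save time. The snippet below is a minimal sketch, not part of medaka: `check_medaka_model` is a hypothetical helper name, and it relies only on the "_pt" filename suffix described above, so it cannot tell what is actually inside an archive.

```shell
# Hypothetical helper (not part of medaka): classify a value meant for
# `medaka_consensus -m` purely by its filename.
# Assumption: PyTorch archives end in "_pt.tar.gz", as described above;
# any other ".tar.gz" is flagged as a possible TensorFlow archive.
check_medaka_model() {
    case "$1" in
        *_pt.tar.gz) echo "pytorch-archive" ;;
        *.tar.gz)    echo "possibly-tensorflow-archive" ;;
        *)           echo "model-name" ;;
    esac
}

# The path from the failing command, then the bare model name.
check_medaka_model /hl/zhoumengqing/software/r1041_e82_400bps_sup_v5.0.0_model.tar.gz
check_medaka_model r1041_e82_400bps_sup_v5.0.0
```

The first call prints `possibly-tensorflow-archive` and the second prints `model-name`; only a `pytorch-archive` or `model-name` result would be expected to load under medaka 2.x, going by the advice above.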

zmqstc commented 1 month ago

@ealdraed Many thanks for the suggestion! I gave it a try and things seem to be running smoothly now!

cjw85 commented 1 month ago

medaka v2 will automatically detect the appropriate model to use provided the requisite headers from the sequencing device are present in the input file.
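
To see whether a reads file is likely to carry such headers, one could inspect its first record. This is a rough sketch under stated assumptions: the field name `basecall_model_version_id=` is assumed here and may not be the only form medaka accepts, `has_model_header` is a hypothetical helper, and the file is assumed to be an uncompressed FASTQ.

```shell
# Hypothetical helper: report whether the first FASTQ header line carries
# a basecaller model field.
# Assumptions: the field is spelled "basecall_model_version_id=" and the
# input is an uncompressed FASTQ whose first line is a read header.
has_model_header() {
    if head -n 1 "$1" | grep -q 'basecall_model_version_id='; then
        echo "model header found"
    else
        echo "no model header"
    fi
}
```

For example, `has_model_header cluster_001/4_reads.fastq` would give a first hint as to whether omitting `-m` is an option for that file; if the field is absent, passing the model name explicitly remains the safer route.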