nanoporetech / medaka

Sequence correction provided by ONT Research

failed to predict model #507

Closed lucyintheskyzzz closed 1 month ago

lucyintheskyzzz commented 1 month ago

Hi, I am trying to determine the right medaka model for each of my fastq.gz files, but I get this error message when I run it on one sample:

(base) [kvigil@qbc141 medaka]$ medaka tools resolve_model --auto_model consensus /work/kvigil/hecatomb/onr.raw.data/M02_barcode01_without_blank.fastq.gz
Cannot import pyabpoa, some features may not be available.
Traceback (most recent call last):
  File "/home/kvigil/.local/bin/medaka", line 8, in <module>
    sys.exit(main())
  File "/home/kvigil/.local/lib/python3.9/site-packages/medaka/medaka.py", line 801, in main
    args = parser.parse_args()
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1820, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1853, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2044, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2021, in consume_positionals
    take_action(action, args)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1930, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1209, in __call__
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1853, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2044, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2021, in consume_positionals
    take_action(action, args)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1930, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1209, in __call__
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1853, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2062, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 2002, in consume_optional
    take_action(action, args, option_string)
  File "/usr/local/packages/python/3.9.7-anaconda/lib/python3.9/argparse.py", line 1930, in take_action
    action(self, namespace, argument_values, option_string)
  File "/home/kvigil/.local/lib/python3.9/site-packages/medaka/medaka.py", line 50, in __call__
    model = medaka.models.model_from_basecaller(input_file, variant=variant)
  File "/home/kvigil/.local/lib/python3.9/site-packages/medaka/models.py", line 146, in model_from_basecaller
    raise KeyError(
KeyError: 'Unknown basecaller model. Please provide a medaka model explicitely using --model.'
cjw85 commented 1 month ago

Could you please provide a sample of the lines in your file M02_barcode01_without_blank.fastq.gz?

lucyintheskyzzz commented 1 month ago

(base) [kvigil@qbc2 onr.raw.data]$ head M02_barcode01.fastq
@b32f2590-167b-4931-85e3-55a69c3f315b runid=d559ed7f061716f0c08cf01d6d1ad021f008fb27 read=16 ch=814 start_time=2023-12-21T09:04:20.786164-06:00 flow_cell_id=PAS49596 protocol_group_id=ONR122123 sample_id=ONR122123 barcode=barcode01 barcode_alias=barcode01 parent_read_id=b32f2590-167b-4931-85e3-55a69c3f315b basecall_model_version_id=dna_r10.4.1_e8.2_400bps_fast@v4.2.0
GTTTCCAGTTCACAATCCGACAGCCAAATTTTCAAACATTATGCGCAGCTCATCACTATATCGAGACACGAGTTAGCAAGTAGCATTTGACTCTGTTTATCGAAATGAATCAAATTATTTGTTTTTGGGAAATACATTCAATTTGATAATGTCAAAAGTGGATTGACTTACAAATAAATCATCATCGATGCCCAGAATCGCTACTCAAAATTTTTATGTTACGCCAGTGCCTAAGTTTCCTCTACACCAAGACTCATTAGTGACAATGGAATACAAAAATATTATTAAGGCCAAATCAAGTACAAAGTGGAAAGGCTGAAACCAAAATTTTAAAAATTTGGAAAAGCGATAGTGACTGGGAAACCA

I ended up unzipping the fastq.gz files to fastq files; the sample lines are shown above.

lucyintheskyzzz commented 1 month ago

Also, I want to run all my samples through medaka at the same time, but I know they were not all basecalled with the same basecaller: some older samples used guppy and the newer samples used dorado. Can medaka handle this?

lucyintheskyzzz commented 1 month ago

Also, my older samples were run on a Mk1B and the newer samples on a PromethION.

cjw85 commented 1 month ago

The header in the fastq shows that a "fast" basecaller model was used (basecall_model_version_id=dna_r10.4.1_e8.2_400bps_fast@v4.2.0). I believe we have stopped providing medaka models for fast basecallers (only hac and sup).
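
For anyone who wants to check this before running medaka, here is a minimal sketch that pulls the basecaller model out of a fastq header and reports whether it is a fast, hac or sup model. The tag name basecall_model_version_id is taken from the header posted earlier in this thread. The helper names, and the assumption that the speed tier always appears as _fast, _hac or _sup in the model name, are mine and not part of medaka. The header string in the example is an abbreviated copy of the one above.

```python
import gzip
import re

# Tag name as it appears in the FASTQ header posted in this thread.
MODEL_TAG = re.compile(r"basecall_model_version_id=(\S+)")


def basecaller_model_from_header(header):
    """Return the basecaller model named in a FASTQ header line, or None."""
    match = MODEL_TAG.search(header)
    return match.group(1) if match else None


def model_speed(model):
    """Return 'fast', 'hac' or 'sup' if the model name contains it.

    Assumes the naming scheme seen in this thread, e.g. ..._400bps_fast@v4.2.0.
    """
    match = re.search(r"_(fast|hac|sup)(?=@|_|$)", model)
    return match.group(1) if match else None


def first_header(path):
    """Read the first line of a FASTQ file, gzip-compressed or plain."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.readline().rstrip("\n")


# Abbreviated copy of the header from this thread.
header = (
    "@b32f2590-167b-4931-85e3-55a69c3f315b read=16 ch=814 "
    "basecall_model_version_id=dna_r10.4.1_e8.2_400bps_fast@v4.2.0"
)
model = basecaller_model_from_header(header)
print(model, model_speed(model))
```

On a real file, something like model_speed(basecaller_model_from_header(first_header(path))) would give the tier for the first read; a result of "fast" suggests the reads need re-basecalling or an explicit --model.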

Medaka processes a single sample at a time; it cannot be used to process multiple samples simultaneously.
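
One way to handle several samples is therefore to call medaka once per file. The sketch below loops over a directory and runs the resolve_model command from this thread on each fastq.gz file. It assumes that medaka is on the PATH, that the command prints the resolved model name to standard output, and that a failure such as the KeyError above produces a non-zero exit code. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def resolve_model_command(fastq):
    """Build the resolve_model command used in this thread for one file."""
    return ["medaka", "tools", "resolve_model", "--auto_model", "consensus", str(fastq)]


def resolve_models(directory):
    """Run medaka once per FASTQ file and map file name to its output, or None."""
    results = {}
    for fastq in sorted(Path(directory).glob("*.fastq.gz")):
        proc = subprocess.run(
            resolve_model_command(fastq), capture_output=True, text=True
        )
        # Assumption: a non-zero exit code means no model could be resolved,
        # as with the "Unknown basecaller model" KeyError shown above.
        results[fastq.name] = proc.stdout.strip() if proc.returncode == 0 else None
    return results
```

Calling resolve_models on the directory of raw data would then give one entry per sample, with None marking the files (for example fast-basecalled ones) that need an explicit --model or re-basecalling.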

lucyintheskyzzz commented 1 month ago

@cjw85 thank you for providing this information. Maybe I can re-basecall all my samples using dorado with a hac or sup model.