Closed: replikation closed this issue 7 months ago
Hi @replikation,
Thank you for reporting this. The parsing code is written to handle SAM outputs that have been converted to fastq with samtools fastq. I realised after implementing it that outputs from MinKNOW format the comment string differently, but I haven't yet had time to implement the logic to parse such data.
Hi @cjw85, thanks for the quick reply. Yes, it would be great to add this, as it's the default output of MinKNOW. It would also be great if medaka exited when it can't find a model, instead of falling back to the default one (e.g., when you run medaka_consensus).
Edit: the model is currently always listed like this in each read header, whose fields are space-separated: basecall_model_version_id=dna_r10.4.1_e8.2_400bps_sup@v4.2.0
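For illustration, here is a minimal Python sketch of pulling the model name out of such a header and failing loudly when the field is missing. It is not medaka's implementation; the function name model_from_header and the read ID and runid in the sample header are made up for the example.

```python
def model_from_header(header: str) -> str:
    """Return the basecaller model named in a MinKNOW-style FASTQ header.

    Hypothetical helper for illustration; not part of medaka.
    """
    key = "basecall_model_version_id="
    # Header fields are whitespace-separated; the first one is the read ID.
    for field in header.split():
        if field.startswith(key):
            return field[len(key):]
    # Raise instead of silently falling back to a default model.
    raise ValueError("no basecall_model_version_id field in read header")


# Sample header: the read ID and runid are invented, the model field is
# the one quoted above.
header = (
    "@read1 runid=abc "
    "basecall_model_version_id=dna_r10.4.1_e8.2_400bps_sup@v4.2.0"
)
print(model_from_header(header))
```

Raising an error here means a caller can stop with a non-zero exit status rather than continuing with a model that may not match the data.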
I've done a quick implementation of parsing the Guppy/MinKNOW-style headers. This will appear shortly as version 1.11.2.
Hi
medaka tools resolve_model is not working on fastq files, even though the model is listed in each read and is a valid model in your medaka code.
Running medaka tools resolve_model in ontresearch/medaka:mr324_shae731e46af6f89b80b1590a9696e6f8630512e358-amd64 causes error:
Read file cat
The basecaller model is written in the read file (basecall_model_version_id=dna_r10.4.1_e8.2_400bps_sup@v4.2.0). Based on a quick glance at your medaka code, you search for some RG comment tags instead of the basecall_model_version_id field?
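To make the difference between the two header styles concrete, here is a rough Python sketch that classifies the comment part of a FASTQ header. It assumes samtools fastq writes copied tags as tab-separated TAG:TYPE:VALUE fields, while MinKNOW writes space-separated key=value fields. The function name comment_style and the sample headers are hypothetical, and this is not how medaka itself decides.

```python
import re

# A SAM optional field looks like TAG:TYPE:VALUE, e.g. RG:Z:some_group.
SAM_TAG = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:")


def comment_style(header: str) -> str:
    """Classify the comment of a FASTQ header line (hypothetical helper)."""
    # Split the read ID from the rest of the line on the first whitespace.
    parts = header.rstrip("\n").split(None, 1)
    if len(parts) < 2:
        return "none"
    comment = parts[1]
    # samtools-style comments: every tab-separated field is a SAM tag.
    if all(SAM_TAG.match(f) for f in comment.split("\t")):
        return "sam_tags"
    # MinKNOW-style comments: every whitespace-separated field is key=value.
    if all("=" in f for f in comment.split()):
        return "key_value"
    return "unknown"


# Invented sample headers for the two styles.
print(comment_style("@r1\tRG:Z:grp\tqs:i:12"))
print(comment_style(
    "@r1 runid=abc basecall_model_version_id=dna_r10.4.1_e8.2_400bps_sup@v4.2.0"
))
```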