Closed FrostFlow13 closed 11 months ago
Yes, r1041_e82_400bps_hac_g632 is the correct model for that Guppy basecaller version.
The medaka and basecaller models are paired, and using a newer medaka model with an older basecaller will probably yield worse results.
Thank you - I appreciate the confirmation!
I'm currently trying to run medaka_consensus on ONT long-read data, and I'm running into some frustrations in trying to figure out which basecalling model to use. I see that there's a (seemingly outdated) guide under "Models" that gives instructions on how to figure out what you should use, but I also see that the newest versions have dropped the nice and clear names based on the Guppy models. Looking around in the issues, I see that this sort of thing has been asked before (by a couple of different users).
To give additional information on my long-read data:
I've also attached the .md and .json (.txt version) report files we were provided, in case those give any additional information needed that I didn't include.
For my specifications, what consensus model should I be using: r1041_e82_400bps_hac_g632, r1041_e82_400bps_hac_v4.0.0, or r1041_e82_400bps_hac_v4.1.0? Or one I didn't list?
I believe that a safe bet for me would be to use the r1041_e82_400bps_hac_g632 option, as it fulfills the "Models" section's suggestion of, "Where a version of Guppy has been used without an exactly corresponding medaka model, the medaka model with the highest version equal to or less than the guppy version should be selected." However, if one of the newer models is an improvement over the g632 version and is still compatible with the pieces I'm using, I definitely want to go with one of those!
Additionally, I think it might be helpful to update the "Models" section to include some information on the new naming system for the models.
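
To make the quoted rule concrete, here is a minimal Python sketch of "highest model version equal to or less than the Guppy version". It is not part of medaka. The `pick_model` helper, the candidate table, and the Guppy versions passed in are all hypothetical, and the version tuples are entered by hand on the assumption that a suffix such as g632 stands for Guppy 6.3.2. The v4.0.0 and v4.1.0 names follow a different scheme and are not covered here.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, int, int]

# Illustrative table only: model name -> Guppy version assumed from its "g" suffix.
# Entries are typed in by hand; nothing here is read from medaka itself.
CANDIDATES: Dict[str, Version] = {
    "r1041_e82_400bps_hac_g615": (6, 1, 5),
    "r1041_e82_400bps_hac_g632": (6, 3, 2),
}


def pick_model(guppy_version: Version, candidates: Dict[str, Version]) -> Optional[str]:
    """Return the candidate with the highest version <= guppy_version, or None."""
    # Keep only models whose assumed Guppy version does not exceed the one used.
    eligible = {name: v for name, v in candidates.items() if v <= guppy_version}
    if not eligible:
        return None
    # Tuples compare element by element, so max() picks the newest eligible model.
    return max(eligible, key=lambda name: eligible[name])


# Hypothetical Guppy versions, chosen only to exercise the rule.
print(pick_model((6, 4, 6), CANDIDATES))
print(pick_model((6, 2, 0), CANDIDATES))
```

With a Guppy version newer than every entry, the sketch falls back to the newest g-tagged model, which matches the reasoning above for choosing g632.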
report_PAO85317_20230615_1429_373a141b.md report_PAO85317_20230615_1429_373a141b.json.txt