Status: Closed (VahidJavaran closed this issue 2 months ago)
Could you provide details on the Qscore filtering applied within MinKNOW during the basecalling process?
The qscore thresholds are model dependent. I will get back to you with the exact numbers.
Could you shed some light on how the MinKNOW settings translate to the standalone Dorado command line, specifically regarding the handling of barcodes and scoring to achieve similar results?
The current MinKNOW barcoding algorithm is different from what dorado uses (although this will be unified quite soon). So the parameters are different. In dorado the scoring parameters can be adjusted using this config file. If the thresholds are lowered, the unclassified rate will go down (for both single ended and double ended checks)
So, to confirm, are the dorado basecaller output files filtered? The answer to this question said they were not, which didn't make sense to me, since the model descriptions say that they have a minimum cutoff score.
dorado standalone (i.e. what you download from this GitHub repo) does not filter anything based on Q score by default.
However when run through MinKNOW a Q score filter is applied to the reads depending on the model used.
but don't you choose a model when running with dorado: hac, sup, or fast? sorry i am just very confused by what the difference is
@gideonav Hi, we select which model should be used for basecalling in MinKNOW. But there is no Qscore option to filter low quality reads in MinKNOW. In the standalone version, we have the option to filter reads with "--min-qscore". The Qscore is not directly related to the basecalling models. You have to set the Qscore separately.
Hi @VahidJavaran,
But there is no Qscore option to filter low quality reads in MinKNOW.
Just to jump in here, the MinKNOW filtering options are displayed during run output setup here:
And can be configured by using the cog button on the right:
As @tijyojwad said, the qscore filter chosen is model dependent: 8 for FAST, 9 for HAC, 10 for SUP.
Hope that helps,
- George
@gideonav
but don't you choose a model when running with dorado: hac, sup, or fast? sorry i am just very confused by what the difference is
In standalone dorado, no default Q score filtering is applied, regardless of which model is selected.
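To get closer to the MinKNOW behaviour from standalone dorado, the model-dependent thresholds quoted in this thread (8 for FAST, 9 for HAC, 10 for SUP) could be passed explicitly via --min-qscore. The following is only a minimal sketch: the function names `minknow_min_qscore` and `dorado_cmd` and the paths `pod5_dir` and `calls.bam` are hypothetical placeholders, and it assumes a dorado version that accepts the tier name (`fast`, `hac`, `sup`) as the model argument. It prints the command instead of running it.

```shell
#!/bin/sh
# Map a model tier to the Q-score threshold quoted in this thread
# (8 for FAST, 9 for HAC, 10 for SUP). Hypothetical helper, not part of dorado.
minknow_min_qscore() {
  case "$1" in
    fast) echo 8 ;;
    hac)  echo 9 ;;
    sup)  echo 10 ;;
    *)    echo "unknown model tier: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) a dorado command with the matching --min-qscore.
# "pod5_dir" and "calls.bam" are placeholder names.
dorado_cmd() {
  q=$(minknow_min_qscore "$1") || return 1
  echo "dorado basecaller $1 pod5_dir --min-qscore $q > calls.bam"
}

dorado_cmd hac
```

The thresholds may differ between MinKNOW releases, so it is worth checking the value shown in the MinKNOW run setup before relying on these numbers.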
I think these options can be selected for sequencing with activated basecalling. Can these options be applied just for basecalling analysis?
Hi, regarding the Qscore in the standalone version, what Qscore value reflects high-confidence basecalling? Many thanks
@0x55555555 said "8 for FAST, 9 for HAC, 10 for SUP"
HAC and SUP are the high accuracy and super accuracy models respectively.
Sorry for the late reply @selmapichot - @eesiribloom answered it correctly. I would also add that only HAC and SUP should be used if accuracy is important.
@VahidJavaran
can these options be applied just for basecalling analysis?
Yes I believe these can also be configured for post-run basecalling.
Hi, many thanks for your reply. Is it possible to do the filtering while basecalling and aligning with dorado? If not, could you please advise on the best method to filter out low-quality reads, and whether to do it before or after alignment?
Yes, it's possible. If you add --min-qscore X to the dorado basecaller cmdline, it will filter out any reads with mean qscore < X and align all remaining reads:
dorado basecaller model pod5 --min-qscore X --reference ref.fasta
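For intuition about what "mean qscore < X" means for a single read, here is a small sketch that computes a mean Q score from a Phred+33 quality string. It assumes the mean is taken over per-base error probabilities and then converted back to a Q score, rather than averaging the raw Phred values; that is an assumption for illustration, not something confirmed in this thread, and dorado's own calculation may differ (for example in which bases it includes). The function name `mean_qscore` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: mean Q score of one Phred+33 quality string,
# averaged in error-probability space (assumed, see the note above).
mean_qscore() {
  printf '%s\n' "$1" | awk '
    BEGIN {
      # Build a character -> ASCII code lookup table for printable characters.
      for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i
    }
    {
      n = length($0); sum = 0
      for (i = 1; i <= n; i++) {
        q = ord[substr($0, i, 1)] - 33   # Phred score of base i
        sum += 10 ^ (-q / 10)            # its error probability
      }
      # Convert the mean error probability back to a Q score.
      printf "%.2f\n", -10 * log(sum / n) / log(10)
    }'
}

mean_qscore 'IIII'   # four bases at Q40, prints 40.00
mean_qscore '+5'     # one base at Q10 and one at Q20, prints 12.60
```

The second call shows why this kind of mean is pulled towards the worst bases: the plain average of Q10 and Q20 would be 15, but the probability-space mean is 12.60.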
Hello, I've been working with both the integrated Dorado basecaller in MinKNOW and its standalone version on one dataset. I have a couple of questions I was hoping you could help me with:
2. I've noticed that in MinKNOW, when I set the "Minimum read splitting score" to 70 and the "Override minimum barcoding score" to 75, and then activate "barcode both ends", the rate of unclassified reads is lower than when using the --barcode-both-ends option in Dorado's standalone version. Could you shed some light on how the MinKNOW settings translate to the standalone Dorado command line, specifically regarding the handling of barcodes and scoring to achieve similar results?
Thanks for your assistance,