mikolmogorov / Flye

De novo assembler for single molecule sequencing reads using repeat graphs

Should I filter based on quality ONT 10.4.1 simplex reads before using --nano-hq? #705

Closed. SonjaKersten closed 3 weeks ago

SonjaKersten commented 1 month ago

Hi,

I have ONT 10.4.1 simplex reads called with Dorado v0.6.2 in SUP mode. According to Dorado's own stats, the distribution of mean_qscore_template is as shown in the attached figure. Note that these scores are NOT based on alignment to a reference.

My question is whether I should pre-filter my reads based on these qscore values to get better performance from Flye --nano-hq. If yes, where should I make the cut?

To give you an idea of how much data I would be sacrificing: I would still have at least 50x coverage if I were to use only reads >=Q20 and >=7 kb.
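For readers who want to check how much coverage a cutoff like the one mentioned above (>=Q20 and >=7 kb) would leave, here is a minimal sketch. It is not part of Flye or Dorado. The function names `mean_qscore`, `read_fastq`, and `coverage_after_filter` are hypothetical, the genome size must be supplied by the user, and the input is assumed to be plain four-line FASTQ with Phred+33 qualities. The mean Q-score is computed by averaging per-base error probabilities, so it may not match Dorado's reported mean_qscore_template exactly.

```python
import math


def mean_qscore(quality, offset=33):
    """Mean Phred score of a read, averaging error probabilities (not Q values)."""
    if not quality:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in quality]
    mean_p = sum(probs) / len(probs)
    return -10 * math.log10(mean_p)


def read_fastq(lines):
    """Yield (name, sequence, quality) from an iterable of four-line FASTQ records.

    Assumption: no line-wrapped sequences; an open text file handle works as input.
    """
    it = iter(lines)
    for header in it:
        seq = next(it, "").strip()
        next(it, "")  # skip the '+' separator line
        qual = next(it, "").strip()
        yield header[1:].strip(), seq, qual


def coverage_after_filter(records, genome_size, min_q=20.0, min_len=7000):
    """Estimated fold coverage from reads passing both the length and quality cutoffs."""
    kept_bases = sum(
        len(seq)
        for _, seq, qual in records
        if len(seq) >= min_len and mean_qscore(qual) >= min_q
    )
    return kept_bases / genome_size
```

With an open FASTQ file handle `fh` and a user-supplied genome size in bases, `coverage_after_filter(read_fastq(fh), genome_size)` gives the fold coverage that would remain after filtering.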

Thanks in advance,

Sonja

[Attached figure: Quality_scores_Run2]

mikolmogorov commented 1 month ago

Hi Sonja,

No need for pre-filtering, --nano-hq should work fine!

Misha