nextstrain / nextclade

Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
https://clades.nextstrain.org
MIT License
219 stars, 58 forks

If the qc.overallStatus of my sequences is mediocre, can we keep them for the next analysis step? #1362

Closed: liamxg closed this issue 10 months ago

liamxg commented 10 months ago

Dear @nextclade team, if the qc.overallStatus of my sequences is bad, should we remove them before the next analysis step?

corneliusroemer commented 10 months ago

It depends a lot on what you're doing, and also on which particular QC rule makes it bad. It could also be that the sequence is perfectly fine and is just a recombinant.

In general, bad QC just means something potentially bad or interesting might be happening, and you should have a closer look if you sequenced it yourself.
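To see which QC rule is responsible for a non-good overall status, one option is to read the per-rule status columns of the TSV output. The following is a minimal sketch, not an official Nextclade tool: the column names are assumed to match those in your own `nextclade.tsv` (check the header, since they can vary by version and dataset), the helper `flagged_rules` is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io

# Per-rule status columns assumed to be present in the Nextclade TSV output.
# Columns missing from a given file are simply skipped.
RULE_COLUMNS = [
    "qc.missingData.status",
    "qc.mixedSites.status",
    "qc.privateMutations.status",
    "qc.snpClusters.status",
    "qc.frameShifts.status",
    "qc.stopCodons.status",
]


def flagged_rules(tsv_file):
    """Map each sequence whose overall status is not 'good' to its non-'good' rules."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    result = {}
    for row in reader:
        if row.get("qc.overallStatus") == "good":
            continue
        result[row["seqName"]] = {
            col: row[col]
            for col in RULE_COLUMNS
            if row.get(col) not in (None, "", "good")
        }
    return result


# Made-up sample rows; a real file has many more columns.
SAMPLE_TSV = (
    "seqName\tqc.overallStatus\tqc.missingData.status\tqc.privateMutations.status\n"
    "seqA\tgood\tgood\tgood\n"
    "seqB\tbad\tgood\tbad\n"
    "seqC\tmediocre\tmediocre\tgood\n"
)

report = flagged_rules(io.StringIO(SAMPLE_TSV))
for name, rules in report.items():
    print(name, rules)
```

With a real file, pass `open("nextclade.tsv", newline="")` instead of the in-memory sample. A sequence flagged only by the private-mutations rule, for example, may deserve a closer look rather than automatic removal.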

liamxg commented 10 months ago

Dear @corneliusroemer, first, if we want to check the sequence quality, should we look at the qc.overallStatus column?

liamxg commented 10 months ago

Dear @corneliusroemer, all of them were downloaded from GISAID. I just uploaded them to Nextclade and checked the nextclade.tsv file.

ivan-aksamentov commented 10 months ago

Dear Liam @liamxg,

Yes, the QC overall status (derived from the overall QC score) is an empirical metric which gives you some idea of the quality of the genome, according to the beliefs of our team. You can learn more about QC in the documentation and/or by inspecting the source code.

The QC subsystem is configurable in the dataset (in the qc.json file for v2 or in the pathogen.json file for v3), so that you can customize it to your needs. Finally, you can implement your own metrics using Nextclade's analysis results or even using the aligned sequences.
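As an illustration of a custom metric built on the analysis results, here is a minimal sketch that selects sequences from the TSV output using project-specific thresholds. It assumes the `seqName`, `qc.overallScore` and `coverage` columns are present in your file; the function names and the threshold values are hypothetical choices for this example, not recommendations, and the sample rows are made up.

```python
import csv
import io


def keep_sequence(row, max_score=100.0, min_coverage=0.9):
    """Return True if a row passes the (hypothetical) project-specific thresholds."""
    try:
        score = float(row["qc.overallScore"])
        coverage = float(row["coverage"])
    except (KeyError, ValueError):
        # Missing or empty values: be conservative and do not keep the sequence.
        return False
    return score < max_score and coverage >= min_coverage


def select_sequences(tsv_file, **thresholds):
    """Return the names of all sequences that pass keep_sequence."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row["seqName"] for row in reader if keep_sequence(row, **thresholds)]


# Made-up sample rows; a real file has many more columns.
SAMPLE_TSV = (
    "seqName\tqc.overallScore\tcoverage\n"
    "seqA\t12.5\t0.99\n"
    "seqB\t250.0\t0.98\n"
    "seqC\t45.0\t0.80\n"
    "seqD\t\t\n"
)

kept = select_sequences(io.StringIO(SAMPLE_TSV))
kept_relaxed = select_sequences(io.StringIO(SAMPLE_TSV), min_coverage=0.75)
print(kept, kept_relaxed)
```

Keeping the thresholds as parameters makes it easy to state, and later revisit, the criteria used for a particular study.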

There is no absolute metric that would tell you what you "should" or "should not" do. Not in Nextclade, not anywhere else. As Cornelius mentioned, Nextclade only tries to attract attention to certain (not all) issues that it detected. The final judgement is yours, and it depends on the goals of your particular research project.

liamxg commented 10 months ago

Dear @ivan-aksamentov,

Thanks. That's very helpful.