Closed (corneliusroemer closed this 2 years ago)
Hi @corneliusroemer
Yes, indeed. These are sequences that failed some of our QC checks. In that case, no pango lineage is called. Our README needs to be updated (see also #3 and #4).
If requested, we could provide a new (potentially incomplete) column that states whether and why QC failed. Would that be helpful to you?
For me it's fine to just know that if sequences fail QC they don't get a pango lineage.
The exact reason is not necessarily important to me, but it may be of interest to submitters.
If you want to do more by way of QC, you could also run Nextclade on the sequences. It gives more details about frameshifts, stop codons, etc.
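As a rough illustration of how Nextclade results could be used for this, here is a minimal sketch that lists sequences whose overall QC status is not "good". It assumes a Nextclade TSV output with the columns `seqName` and `qc.overallStatus`; check the column names against the Nextclade version you run. The function name `failed_qc` and the toy rows are made up for the example.

```python
import csv
import io


def failed_qc(tsv_text, status_col="qc.overallStatus", name_col="seqName"):
    """Return (name, status) pairs for rows whose QC status is not 'good'.

    Column names are assumptions about the Nextclade TSV layout.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row[name_col], row[status_col])
        for row in reader
        if row[status_col] != "good"
    ]


# Toy stand-in for a Nextclade TSV file (not real data).
toy_tsv = (
    "seqName\tqc.overallStatus\n"
    "seq-1\tgood\n"
    "seq-2\tbad\n"
    "seq-3\tmediocre\n"
)

print(failed_qc(toy_tsv))
```

With a real file, the text would come from something like `open(path, newline="").read()` instead of the toy string.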
When joining metadata and pango calls (from Entwicklungslinien) I noticed that about 10k sequences seem to have no pango calls.
What's the reason for this? Did these sequences not pass pango's QC requirements?
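For reference, a minimal sketch of the kind of join described above: it reports every metadata row that has no matching, non-empty lineage call. The column names `sequence_id` and `lineage`, the function name `ids_without_lineage`, and the toy rows are hypothetical and not taken from the actual files.

```python
import csv
import io


def ids_without_lineage(metadata_rows, lineage_rows,
                        key="sequence_id", lineage_col="lineage"):
    """Return ids present in the metadata that lack a non-empty lineage call."""
    # Ids that do have a usable lineage call.
    called = {
        row[key]
        for row in lineage_rows
        if (row.get(lineage_col) or "").strip()
    }
    # Keep metadata order so the result is easy to compare with the input.
    return [row[key] for row in metadata_rows if row[key] not in called]


# Toy stand-ins for the metadata and lineage tables (not real data).
metadata_csv = "sequence_id,date\nseq-1,2021-01-01\nseq-2,2021-01-02\nseq-3,2021-01-03\n"
lineages_csv = "sequence_id,lineage\nseq-1,B.1.1.7\nseq-3,\n"

metadata_rows = list(csv.DictReader(io.StringIO(metadata_csv)))
lineage_rows = list(csv.DictReader(io.StringIO(lineages_csv)))

missing = ids_without_lineage(metadata_rows, lineage_rows)
print(missing)
```

Treating an empty lineage field the same as a missing row is a design choice here, since either case leaves the sequence without a pango call after the join.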
Here's a list of the affected sequence ids: missing_pango.csv
Here's head/tail: