I noticed the following potential QC issues with the 42 sequences submitted by Viscardi et al. when compared against the B.1 outbreak reference:
- 13 sequences are very overdiverged, with 45 to 1450 SNPs
- Almost all sequences have frameshifts: OPG029:152-156, OPG047:477-483 caused by a single insertion
- All dates are given as 2023-05, which is surprising for the following reasons:
  - It is long past the epidemic peak in Europe
  - It seems unlikely that the lab would sequence so many samples from the same month and none from other months
  - Some sequences have very few, if any, SNPs compared to other sequences from Europe that were collected in summer 2022, suggesting that these sequences are also from summer 2022
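The divergence and date checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record schema (`accession`, `date`, `snps`), the function name `qc_summary`, and the cutoff of 45 SNPs (taken from the low end of the range reported above) are hypothetical, and the example records are made up rather than taken from the submission.

```python
from collections import Counter

# Assumed cutoff: the lowest SNP count reported for the overdiverged group.
OVERDIVERGED_MIN_SNPS = 45


def qc_summary(records):
    """Summarize divergence and collection dates.

    `records` is an iterable of dicts with the hypothetical keys
    'accession', 'date' (as given in the metadata) and 'snps'
    (SNP count relative to the outbreak reference).
    """
    records = list(records)
    overdiverged = [
        r["accession"] for r in records if r["snps"] >= OVERDIVERGED_MIN_SNPS
    ]
    date_counts = Counter(r["date"] for r in records)
    return {
        "overdiverged": overdiverged,
        "date_counts": dict(date_counts),
        # True when every record carries exactly the same date string.
        "all_same_date": len(date_counts) == 1,
    }


# Made-up records, for illustration only.
example = [
    {"accession": "SEQ_A", "date": "2023-05", "snps": 3},
    {"accession": "SEQ_B", "date": "2023-05", "snps": 1450},
    {"accession": "SEQ_C", "date": "2023-05", "snps": 45},
]
summary = qc_summary(example)
print(summary["overdiverged"])
print(summary["all_same_date"])
```

A single shared date across a whole submission is not an error by itself, so the `all_same_date` flag is best read as a prompt to ask the submitter, not as a filter.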
Release date: 2024-01-15
Submitter: Viscardi et al.
Submitting institution: Istituto Zooprofilattico Sperimentale del Mezzogiorno, Hanimal Health
Country: Italy (Sicily/Campania)
NCBI virus link: https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/virus?SeqType_s=Nucleotide&VirusLineage_ss=Monkeypox%20virus,%20taxid:10244&Authors_idx%20q.op%3DAND=viscardi&CreateDate_dt=2024-01-09T00:00:00.00Z%20TO%202024-01-20T23:59:59.00Z
Example Genbank: https://www.ncbi.nlm.nih.gov/nuccore/PP098578
Status: Submitter has been contacted (2024-01-19)
List of Genbank accessions
```
PP098578
PP098579
PP098580
PP098581
PP098582
PP098583
PP098584
PP098585
PP098586
PP098587
PP098588
PP098589
PP098590
PP098591
PP098592
PP098593
PP098594
PP098595
PP098596
PP098597
PP098598
PP098599
PP098600
PP098601
PP098602
PP098603
PP098604
PP098605
PP098606
PP098607
PP098608
PP098609
PP098610
PP098611
PP098612
PP098613
PP098614
PP098615
PP098616
PP098617
PP098618
PP098619
```
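The accessions listed above form a contiguous range (PP098578 to PP098619), so the list can be regenerated and its length checked against the stated 42 sequences. A minimal sketch; the helper name `accession_range` is hypothetical:

```python
def accession_range(first: str, last: str) -> list[str]:
    """Expand an inclusive range of accessions that share a letter prefix."""
    prefix = first.rstrip("0123456789")
    if not last.startswith(prefix) or len(first) != len(last):
        raise ValueError("accessions must share a prefix and have equal length")
    width = len(first) - len(prefix)  # number of digits, kept zero-padded
    start, end = int(first[len(prefix):]), int(last[len(prefix):])
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = accession_range("PP098578", "PP098619")
print(len(accessions))  # 42
```

This makes it easy to exclude the whole batch in a downstream filter without retyping each accession.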