If a representative sequence is not present, pull the longest and cleanest available sequence from the lineage (the one with the fewest mixed bases and the least missing data) that has the earliest collection date.
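The selection rule above could be sketched roughly as follows. This is only an illustration, not the team's actual pipeline: the `Record` class, the `count_ambiguous` and `pick_representative` helpers, and the tie-breaking order (cleanest first, then longest, then earliest collection date) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: anything other than A/C/G/T (IUPAC mixed bases, N, gaps)
# counts as a mixed base or missing data.
UNAMBIGUOUS = set("ACGT")


@dataclass
class Record:
    """Hypothetical container for one genome from a lineage."""
    name: str
    sequence: str
    collection_date: date


def count_ambiguous(sequence: str) -> int:
    """Count positions that are not an unambiguous A, C, G, or T."""
    return sum(1 for base in sequence.upper() if base not in UNAMBIGUOUS)


def pick_representative(records: list[Record]) -> Record:
    """Pick a fallback representative for a lineage.

    Assumed priority: fewest ambiguous/missing positions, then longest
    sequence, then earliest collection date.
    """
    if not records:
        raise ValueError("no sequences available for this lineage")
    return min(
        records,
        key=lambda r: (
            count_ambiguous(r.sequence),
            -len(r.sequence),
            r.collection_date,
        ),
    )


# Toy data only; real genomes are ~30 kb and would be read from FASTA.
lineage = [
    Record("a", "ACGTNN", date(2021, 1, 1)),
    Record("b", "ACGTAC", date(2021, 3, 1)),
    Record("c", "ACGTAC", date(2021, 2, 1)),
    Record("d", "ACGTA", date(2020, 1, 1)),
]
best = pick_representative(lineage)
print(best.name)
```

How the three criteria should be weighed against each other is not stated in the reply, so the ordering in the sort key may need to be adjusted.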
Hi TOAST team,
I am interested in curating representative genomes for VOCs/VBMs. According to your recent publication (Xiaoli, Lingzi, et al. "Benchmark datasets for SARS-CoV-2 surveillance bioinformatics." PeerJ 10 (2022): e13821.), datasets 4 and 5 were prepared based on alignments to the 'internally curated consensus sequences'. Could you share the details of how those sequences were curated internally? Thank you.
Best, Gyuhyon