KaimingTao closed this issue 2 years ago
Are you trying to record the accessions from SRA/ENA? If so, just use the genbank_accession column. No program currently reads that column; it exists only so that we can link back to the original sequences.
A similar issue arises when two or more sequences come from the same day but were extracted from different places, for example Williamson21 Day 155.
Lohr21 didn't specify the sequences; it only reported the prevalence of mutations on a single day. Currently, I can do one of two things: 1) split the mutations across different days so that validation passes, or 2) combine all the mutations into one iso_name.
After discussion, here are the conclusions.
1) For NGS mutations, save count and total so that the percentage can be derived. 2) For sequences from different places in the body, merge the mutations and use the consensus mutation list.
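A rough sketch of how the two conclusions could look in code. Everything here is an assumption for illustration, not the project's actual code: the function names, the `min_fraction` threshold, and the sample mutation lists are made up. In particular, "consensus" is read here as "seen in more than a given fraction of the samples"; setting `min_fraction=0` gives the plain union instead.

```python
from collections import Counter


def mutation_percentage(count, total):
    """Percentage of NGS reads carrying a mutation, derived from the stored count and total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def consensus_mutations(samples, min_fraction=0.5):
    """Merge mutation lists from several same-day samples.

    Keeps a mutation if it appears in more than `min_fraction` of the samples
    (hypothetical rule; the thread does not define "consensus" precisely).
    """
    if not samples:
        return []
    # Count each mutation at most once per sample.
    seen = Counter(m for sample in samples for m in set(sample))
    return sorted(m for m, n in seen.items() if n / len(samples) > min_fraction)


# Made-up data: three samples from different body sites on the same day.
site_a = ["E484K", "N501Y"]
site_b = ["E484K"]
site_c = ["E484K", "N501Y", "D614G"]

merged = consensus_mutations([site_a, site_b, site_c])
union = consensus_mutations([site_a, site_b, site_c], min_fraction=0)
pct = mutation_percentage(count=25, total=200)
```

Storing count and total rather than the percentage itself keeps the read depth available, so a low-coverage 50% can still be told apart from a high-coverage one.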
For example Williamson21