eringill / chronic_infection_python

A simple GUI that allows users to check whether mutations from a SARS-CoV-2 genome best fit a mutational distribution of genomes derived from global, chronic, or deer infections.
https://eringill.shinyapps.io/covid-mutation-distributions/
MIT License

Non-canonical nucleotides? #8

Closed fionabrinkman closed 3 weeks ago

fionabrinkman commented 3 weeks ago

Do you want to throw an error if users ever include a non-canonical nucleotide like "F", or anything else that is not in this list of IUPAC nucleotide codes? https://www.bioinformatics.org/sms/iupac.html

eringill commented 3 weeks ago

Users can enter nucleotides in either uppercase or lowercase, including A, C, T, G and U. They can also indicate indels at specific nucleotide positions, but the notation they use varies (28362Cdel, 27382del_insCT, 3725insC, etc.). Because of this variety of entry styles, the app needs to be able to parse the nucleotide position from each list entry, but a given entry may or may not have two nucleotides associated with it.

Therefore, it would be extremely difficult to parse, and so to validate, every letter associated with a numeric entry.
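
For illustration, the position parsing described above might look roughly like the following. This is a minimal sketch and not the app's actual code: the helper name `parse_position` is hypothetical, and the substitution-style entry `c241t` is an assumed input format added only to show that case is ignored. The sketch extracts the first run of digits and deliberately leaves the surrounding letters unvalidated.

```python
import re

# Hypothetical helper for illustration; not taken from the app's source.
_POSITION = re.compile(r"\d+")


def parse_position(entry: str) -> int:
    """Return the first run of digits in a mutation entry as an integer.

    The letters around the digits are intentionally not inspected, because
    indel notations such as "del", "ins" and "del_ins" mix freely with
    nucleotide letters.
    """
    match = _POSITION.search(entry)
    if match is None:
        raise ValueError(f"no nucleotide position found in {entry!r}")
    return int(match.group())


# The first three entries are the examples from this thread; "c241t" is an
# assumed lowercase substitution entry.
entries = ["28362Cdel", "27382del_insCT", "3725insC", "c241t"]
positions = [parse_position(e) for e in entries]
print(positions)
```

Because only the digits are read, an entry containing a letter such as "F" would still yield a position here. Rejecting it would require knowing which letters are nucleotides and which belong to the indel notation.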