mpievolbio-scicomp / rarefan

http://rarefan.evolbio.mpg.de
MIT License

Accept non ACGT characters in sequence uploads #50

Closed (CFGrote closed 1 year ago)

CFGrote commented 1 year ago

GCA_000009985_1_ASM998v1_genomic.fna is rejected with the message "GCA_000009985_1_ASM998v1_genomic.fna contains non-DNA sequences and will be removed." This should not happen: N's (and other characters as per https://en.wikipedia.org/wiki/Nucleic_acid_sequence#:~:text=The%20possible%20letters%20are%20A,linked%20to%20a%20phosphodiester%20backbone) must be accepted.

CFGrote commented 1 year ago

It seems the file contains not only ACGT and N's but also other ambiguity codes (e.g. M's). Will include all letters from the Wikipedia link above in the validation.
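
For illustration only, a minimal sketch of what such a relaxed check could look like. This is not the actual rarefan code: the names `IUPAC_NUCLEOTIDES` and `accepts_sequence` are hypothetical, and the accepted alphabet (the four bases plus the IUPAC ambiguity codes, without U or gap characters) is an assumption that may differ from what the fix actually allows.

```python
# Hypothetical sketch, not the rarefan implementation.
# Assumed alphabet: A, C, G, T plus the IUPAC ambiguity codes
# R, Y, S, W, K, M, B, D, H, V and N.
IUPAC_NUCLEOTIDES = frozenset("ACGTRYSWKMBDHVN")


def accepts_sequence(fasta_text, allowed=IUPAC_NUCLEOTIDES):
    """Return True if every sequence line uses only allowed letters.

    Header lines (starting with '>') and blank lines are skipped.
    Input without any sequence line is rejected.
    """
    saw_sequence = False
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(">"):
            continue
        saw_sequence = True
        # Compare case-insensitively; soft-masked (lowercase) bases are common.
        if not set(line.upper()) <= allowed:
            return False
    return saw_sequence
```

With this check, a record such as `>seq1` followed by `ACGTNNMACGT` passes, while a protein-like line containing letters outside the set (e.g. `E`, `L`, `Q`) is still rejected.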

CFGrote commented 1 year ago

Fixed by PR #50.