Closed. CFGrote closed this issue 1 year ago.
GCA_000009985_1_ASM998v1_genomic.fna is filtered out with the message "GCA_000009985_1_ASM998v1_genomic.fna contains non-DNA sequences and will be removed." This should not happen, as N's (and other characters as per https://en.wikipedia.org/wiki/Nucleic_acid_sequence#:~:text=The%20possible%20letters%20are%20A,linked%20to%20a%20phosphodiester%20backbone) must be accepted.
It seems the file contains not only ACGT and N but also other letters (e.g. M). I will include all letters from the Wikipedia link above in the validation.
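For illustration, a minimal sketch of what such a validation could look like. This is not the code from the actual fix; the names `IUPAC_NUCLEOTIDES` and `is_nucleotide_sequence` are hypothetical, and the accepted alphabet is assumed to be the standard IUPAC nucleotide codes (A, C, G, T, U plus the ambiguity codes R, Y, S, W, K, M, B, D, H, V, N).

```python
# Hypothetical helper, not taken from the project: accepts any sequence
# made up only of IUPAC nucleotide codes, including ambiguity codes.
IUPAC_NUCLEOTIDES = frozenset("ACGTURYSWKMBDHVN")


def is_nucleotide_sequence(seq: str) -> bool:
    """Return True if seq is non-empty and uses only IUPAC nucleotide codes.

    The check is case-insensitive, so soft-masked (lowercase) regions pass.
    """
    return bool(seq) and set(seq.upper()) <= IUPAC_NUCLEOTIDES


if __name__ == "__main__":
    print(is_nucleotide_sequence("ACGTNNMACGT"))  # N and M are ambiguity codes
    print(is_nucleotide_sequence("MKVLE"))        # L and E are not nucleotide codes
```

A set comparison keeps the check independent of sequence order and makes the accepted alphabet easy to extend, for example with a gap character if aligned input should pass as well.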
Fixed by PR #50.