Closed anna-parker closed 1 week ago
@fengelniederhammer the backend uses the nucleotide symbol list in the type check validateNoUnknownNucleotideSymbol, which is used for both unaligned and aligned sequences. Since '-' is a valid symbol for aligned sequences, I think it is OK to keep the backend as is.
Isn't it the same issue? In aligned sequences '-' should be allowed; in unaligned sequences it should not. We could easily introduce a new list in the backend that splits the validation.
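To make the proposed split concrete, here is a minimal sketch of what validating against two symbol lists could look like. It is written in Python purely for illustration and is not the backend's actual code: the names `IUPAC_NUCLEOTIDE_SYMBOLS`, `ALIGNED_SYMBOLS` and `find_unknown_symbols` are hypothetical, and the case-insensitive comparison is an assumption rather than something confirmed in this thread.

```python
# Hypothetical sketch: one symbol list per sequence kind.
# Standard IUPAC nucleotide letters (bases plus ambiguity codes), without the gap symbol.
IUPAC_NUCLEOTIDE_SYMBOLS = frozenset("ACGTURYSWKMBDHVN")
GAP_SYMBOL = "-"

# Aligned sequences may additionally contain gaps.
ALIGNED_SYMBOLS = IUPAC_NUCLEOTIDE_SYMBOLS | {GAP_SYMBOL}


def find_unknown_symbols(sequence: str, aligned: bool) -> list[tuple[int, str]]:
    """Return (position, symbol) pairs for every symbol that is not allowed.

    Assumption: symbols are compared case-insensitively.
    """
    allowed = ALIGNED_SYMBOLS if aligned else IUPAC_NUCLEOTIDE_SYMBOLS
    return [
        (position, symbol)
        for position, symbol in enumerate(sequence)
        if symbol.upper() not in allowed
    ]


# The same input passes as an aligned sequence but fails as an unaligned one.
print(find_unknown_symbols("ACG-T", aligned=True))
print(find_unknown_symbols("ACG-T", aligned=False))
```

With this shape, the existing check would stay unchanged for aligned sequences, and only the unaligned path would start rejecting '-'.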
resolves #
preview URL: https://no-gaps-in-unaligned.loculus.org/
Summary
'-' only makes sense in the context of aligned sequences. It is not accepted by ENA and is not included in official IUPAC lists: https://genome.ucsc.edu/goldenPath/help/iupac.html#:~:text=The%20International%20Union%20of%20Pure,for%20either%20G%20or%20A).