jeromekelleher / sc2ts

Infer a succinct tree sequence from SARS-COV-2 variation data
MIT License

Uppercase input sequences #184

Closed szhan closed 2 months ago

szhan commented 2 months ago

Sample sequences may come in lowercase, for example in the latest batch of sequences kindly prepared by Martin Hunt from the Iqbal group. Here is a simple tweak that converts lowercase sequences to uppercase so that they can be mapped to the uppercase nucleotide characters in core.ALLELES. Note that a sequence is stored as a NumPy array (dtype="<U1"), and the entire array is uppercased.
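For illustration, here is a minimal sketch of the idea, not the actual diff in this PR. It assumes `np.char.upper` is an acceptable way to uppercase a `<U1` array. The `ALLELES` string below is a hypothetical stand-in for `core.ALLELES`, and the helper names `uppercase_sequence` and `encode_alleles` are made up for this example.

```python
import numpy as np

# Hypothetical stand-in for core.ALLELES; the real value in sc2ts may differ.
ALLELES = "ACGT-"


def uppercase_sequence(seq):
    """Return a copy of a one-character-per-element array, uppercased."""
    # np.char.upper applies str.upper element-wise and keeps the "<U1" dtype.
    return np.char.upper(seq)


def encode_alleles(seq, alleles=ALLELES, missing=-1):
    """Map each character to its index in `alleles` (hypothetical helper)."""
    upper = uppercase_sequence(seq)
    # Characters not found in `alleles` keep the `missing` value.
    encoded = np.full(upper.shape, missing, dtype=np.int8)
    for index, allele in enumerate(alleles):
        encoded[upper == allele] = index
    return encoded


# A mixed-case sequence stored one character per array element.
seq = np.array(list("acgTn-"), dtype="<U1")
upper = uppercase_sequence(seq)
encoded = encode_alleles(seq)
```

Uppercasing the whole array in one call avoids a per-character Python loop, and the input array is left unmodified because `np.char.upper` returns a new array.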