uio-bmi / immuneML

immuneML is a platform for machine learning analysis of adaptive immune receptor repertoire data.
https://immuneml.uio.no
GNU Affero General Public License v3.0

bugfix: if sequence is string of length 0 but not None, this is also considered 'empty' #95

Closed LonnekeScheffer closed 3 years ago

LonnekeScheffer commented 3 years ago

This edge case occurred in some Adaptive files. A sequence in the input file could have length 2 (for example 'TF'); after trimming the leading and trailing amino acids, the remaining sequence had length 0. Because the trimming happens after "standardize none values" is applied, some empty sequences ended up as None and others as "". With this fix, "" sequences are also removed upon import.
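
For illustration, the edge case can be reproduced with a minimal sketch. This is not the actual immuneML import code: the function names `trim_ends`, `is_empty` and `drop_empty_sequences` are hypothetical, and trimming is assumed to remove exactly one amino acid from each end.

```python
from typing import List, Optional


def trim_ends(sequence: Optional[str]) -> Optional[str]:
    """Hypothetical trimming step: drop the first and last amino acid."""
    if sequence is None:
        return None
    return sequence[1:-1]


def is_empty(sequence: Optional[str]) -> bool:
    """Treat both None and the zero-length string as 'empty'."""
    return sequence is None or len(sequence) == 0


def drop_empty_sequences(sequences: List[Optional[str]]) -> List[str]:
    """Keep only sequences that still contain at least one amino acid."""
    return [s for s in sequences if not is_empty(s)]


# 'TF' has length 2, so trimming leaves "" rather than None.
raw = ["CASSLGTF", "TF", None]
trimmed = [trim_ends(s) for s in raw]
kept = drop_empty_sequences(trimmed)
```

A check for `None` alone would let the trimmed 'TF' entry through as ""; checking the length as well catches both cases.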