wukevin / tcr-bert

Large language modeling applied to T-cell receptor (TCR) sequences.
Apache License 2.0

Encode wildcard residues #2

Open chenmc1996 opened 2 years ago

chenmc1996 commented 2 years ago

Thanks for the good work. Is it possible to input amino acid (AA) sequences that contain wildcard residues?

wukevin commented 2 years ago

While it is technically possible, the wildcard would simply become an "unknown" token and wouldn't carry any intrinsic meaning.
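
To illustrate the point above, here is a minimal sketch of how a character-level tokenizer typically handles a residue outside its vocabulary. It is a toy example, not the actual TCR-BERT tokenizer: the vocabulary, the `[UNK]` token name, and the `encode` helper are all assumptions made for illustration.

```python
from typing import List

# Toy character-level tokenizer. This is NOT the TCR-BERT vocabulary or API;
# every name below is hypothetical and exists only for illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
UNK_TOKEN = "[UNK]"  # assumed name for the unknown token

VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
VOCAB[UNK_TOKEN] = len(VOCAB)


def encode(seq: str) -> List[int]:
    """Map each residue to its vocabulary index, falling back to the unknown token."""
    unk_id = VOCAB[UNK_TOKEN]
    return [VOCAB.get(residue, unk_id) for residue in seq.upper()]


# "X" is a common wildcard symbol; it is not in the vocabulary above,
# so it receives the same id as any other unrecognized character.
ids = encode("CASSXGYEQYF")
print(ids)
print(ids[4] == VOCAB[UNK_TOKEN])
```

In this sketch every unrecognized symbol collapses to the same id, so the model cannot distinguish a deliberate wildcard from any other unknown character. That is why the wildcard would not carry any intrinsic meaning.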