Closed by yy1252450987 4 years ago
There is no `<mask>` token in ELMo/SeqVec because the model is auto-regressive: unlike BERT, it does not need to mask out tokens during training, since it is trained only to predict the next character in a given sequence. Your `<mask>` is therefore mapped to `<unk>` (unknown character), because it is not a valid amino acid.
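To illustrate the fallback behavior, here is a minimal sketch (not SeqVec's actual code; the vocabulary, `UNK_ID`, and `encode` are hypothetical) of how a character vocabulary maps any token outside the amino-acid alphabet to `<unk>`:

```python
# Hypothetical sketch: a character vocabulary over the 20 standard
# amino acids, with everything else falling back to <unk>.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
vocab = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
UNK_ID = len(vocab)      # index reserved for the unknown character
vocab["<unk>"] = UNK_ID

def encode(tokens):
    # Any token not in the amino-acid alphabet maps to <unk>,
    # which is exactly what happens to a "<mask>" token here.
    return [vocab.get(tok, UNK_ID) for tok in tokens]

print(encode(["M", "K", "<mask>", "L"]))  # → [10, 8, 20, 9]
```

So from the model's point of view, feeding `<mask>` is indistinguishable from feeding any other character it has never seen.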