I notice that there are several special tokens in the alphabet, which are neither amino acids nor gaps. What do they mean?

These correspond to:
- `<cls>` (classification token): in ESM-1 and the MSA Transformer, this is the beginning-of-sentence token.
- `<pad>` (padding token): enables sequences of variable length in the same batch; the model ignores pad tokens.
- `<unk>` (unknown token): if an input contains a token that isn't in the trained dictionary, the tokenizer replaces it with `<unk>` so that inference still works.
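
If it helps to see these concretely, here is a minimal sketch using the fair-esm package (the `Alphabet` attributes shown, e.g. `cls_idx`/`padding_idx`/`unk_idx`, come from that package; I'm assuming the ESM-1b vocabulary, and the printed index values are illustrative):

```python
# Minimal sketch, assuming fair-esm (pip install fair-esm).
# Building the Alphabet directly avoids downloading model weights.
import esm

alphabet = esm.Alphabet.from_architecture("ESM-1b")
print(alphabet.cls_idx, alphabet.padding_idx, alphabet.unk_idx)

# The batch converter prepends <cls> to every sequence and right-pads
# shorter sequences with <pad>, so one batch can mix lengths:
batch_converter = alphabet.get_batch_converter()
labels, strs, tokens = batch_converter([("seq1", "MKTAYIAK"), ("seq2", "MKT")])
print(tokens[1])  # the "MKT" row ends in alphabet.padding_idx
```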
You can ignore `.` and `<null_1>`. Additional tokens are often included in embedding dictionaries in order to pad their size to a desired length for computational reasons.
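
For completeness, the full vocabulary, including `.`, `-`, and the `<null_1>` filler, can be listed from the same `alphabet` object as in the sketch above:

```python
# all_toks shows the full vocabulary, including the gap characters and the
# <null_1> token that rounds the vocabulary size up (to a multiple of 8
# in fair-esm) for computational reasons.
print(alphabet.all_toks)
# ['<cls>', '<pad>', '<eos>', '<unk>', 'L', 'A', ..., '.', '-', '<null_1>', '<mask>']
```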