meta-llama / llama3

The official Meta Llama 3 GitHub site

The token id exceeds the size of tokenizer.vocab_size #276

Open · zcharon opened 4 months ago

zcharon commented 4 months ago

tokenizer.vocab_size = 12800, so why does token id 12800 appear? Shouldn't every token id be < tokenizer.vocab_size?

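One likely explanation: in Hugging Face-style tokenizers, `vocab_size` typically counts only the base vocabulary, while special tokens added on top of it receive ids starting at `vocab_size`, so an id equal to `vocab_size` is not necessarily out of range. A toy sketch of that behaviour (the `ToyTokenizer` class and the token name below are hypothetical illustrations, not the actual Llama 3 tokenizer code):

```python
# Toy sketch (hypothetical, not the Llama 3 tokenizer): `vocab_size`
# covers only the base vocabulary; special tokens appended afterwards
# get ids >= vocab_size.
class ToyTokenizer:
    def __init__(self, base_vocab):
        self.vocab = dict(base_vocab)      # token -> id, ids 0..N-1
        self.vocab_size = len(base_vocab)  # base vocabulary only

    def add_special_token(self, token):
        # New tokens are appended after the base vocab,
        # so their ids start at vocab_size.
        self.vocab[token] = len(self.vocab)
        return self.vocab[token]

tok = ToyTokenizer({"a": 0, "b": 1})
eot_id = tok.add_special_token("<|end_of_text|>")
print(tok.vocab_size, eot_id)  # 2 2 -> id == vocab_size, by design
```

If this matches what the real tokenizer does, the fix on the caller's side is to size embedding tables by the total token count (base vocabulary plus added special tokens) rather than by `vocab_size` alone.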

subramen commented 3 months ago

I'm not aware of such a constraint. Can you share more details on how this impacts your work?