Open zcharon opened 4 months ago
tokenizer.vocab_size is 12800, so why does token id 12800 appear? Shouldn't every token id be less than tokenizer.vocab_size?
I'm not aware of such a constraint. Can you share more details on how this impacts your work?
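One possible explanation, offered as an assumption because the thread does not say which library is in use: in Hugging Face transformers-style tokenizers, vocab_size counts only the base vocabulary, tokens added later receive ids starting at vocab_size, and len(tokenizer) reports the full count. The sketch below models that behaviour with a hypothetical ToyTokenizer class (not a real library API) to show how an id equal to vocab_size can arise.

```python
class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer with a base vocabulary plus added tokens."""

    def __init__(self, base_tokens, added_tokens):
        # Base vocabulary gets ids 0 .. len(base_tokens) - 1.
        self.base = {tok: i for i, tok in enumerate(base_tokens)}
        # Added tokens get ids appended after the base vocabulary (assumed layout).
        offset = len(self.base)
        self.added = {tok: offset + i for i, tok in enumerate(added_tokens)}

    @property
    def vocab_size(self):
        # Counts only the base vocabulary, excluding added tokens.
        return len(self.base)

    def __len__(self):
        # Full size: base vocabulary plus added tokens.
        return len(self.base) + len(self.added)

    def token_to_id(self, token):
        # Added tokens take precedence; returns None for unknown tokens.
        return self.added.get(token, self.base.get(token))


# 12800 base tokens plus one added token, mirroring the numbers in the question.
tok = ToyTokenizer([f"t{i}" for i in range(12800)], ["<extra>"])
extra_id = tok.token_to_id("<extra>")
print(tok.vocab_size, len(tok), extra_id)
```

If this is what is happening, sizing anything indexed by token id (an embedding table, for example) from the full length rather than from vocab_size would avoid out-of-range ids.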