Open yangjiabupt opened 10 months ago
The vocab size in the config is `"vocab_size": 155947`.
However, the tokenizer vocab has only 155514 entries.
What are the extra tokens used for?
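
For reference, the size of the gap can be computed from the two numbers quoted above. This is only a minimal sketch of the arithmetic: the helper name `unused_vocab_entries` is hypothetical and not part of any library, and it does not explain what the extra entries are for.

```python
def unused_vocab_entries(config_vocab_size: int, tokenizer_vocab_size: int) -> int:
    """Hypothetical helper: count config vocab entries with no tokenizer token."""
    if tokenizer_vocab_size > config_vocab_size:
        # A tokenizer larger than the config would be a different problem.
        raise ValueError("tokenizer vocab is larger than config vocab_size")
    return config_vocab_size - tokenizer_vocab_size


# The two values reported in this issue.
gap = unused_vocab_entries(155947, 155514)
print(gap)  # 433
```

So 433 entries in the config have no corresponding token in the tokenizer.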