Closed lucasjinreal closed 1 year ago
From the LLaMA tokenizer, I saw the vocab size was 32k, but somewhere it says 40k?
The official LLaMA tokenizer vocab size is 32k. I don't think I've seen 40k anywhere. You can check it in the Hugging Face LLaMA config (the `vocab_size` field).
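If you have a checkpoint locally, one quick way to check is to read `vocab_size` out of its `config.json`. This is only a minimal sketch: `read_vocab_size` is a hypothetical helper, not part of any library, and the temporary file below just stands in for a real config, which has many more fields.

```python
import json
import tempfile
from pathlib import Path


def read_vocab_size(config_path):
    """Return the `vocab_size` field from a Hugging Face style config.json."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return config["vocab_size"]


# Stand-in config for illustration only; replace the path with your
# checkpoint's real config.json.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(
        json.dumps({"model_type": "llama", "vocab_size": 32000}),
        encoding="utf-8",
    )
    vocab_size = read_vocab_size(path)

print(vocab_size)
```

Point it at the `config.json` of whichever checkpoint reported 40k to see what that model actually uses.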