Closed ShuheWang1998 closed 2 months ago
Some tokens are mapped to different token ids by the fast and slow tokenizers, so the parameter "tokenizer_mode" needs to be set explicitly when loading the model with vLLM.
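A minimal sketch of how this could look in practice. The helper names `find_mismatches` and `load_llm` are hypothetical. Defaulting to "slow" is an assumption, since the thread does not say which mode the model expects. `LLM(model=..., tokenizer_mode=...)` is the vLLM constructor argument referred to above.

```python
from typing import Callable, Iterable, List, Tuple

# An encoder is any callable that turns text into a list of token ids.
Encoder = Callable[[str], List[int]]


def find_mismatches(
    encode_fast: Encoder,
    encode_slow: Encoder,
    texts: Iterable[str],
) -> List[Tuple[str, List[int], List[int]]]:
    """Return (text, fast_ids, slow_ids) for each text the two encoders disagree on."""
    mismatches = []
    for text in texts:
        fast_ids = list(encode_fast(text))
        slow_ids = list(encode_slow(text))
        if fast_ids != slow_ids:
            mismatches.append((text, fast_ids, slow_ids))
    return mismatches


def load_llm(model_name: str, tokenizer_mode: str = "slow"):
    """Load a model with vLLM, pinning the tokenizer mode explicitly.

    Assumption: "slow" is the mode the model was trained with; pass "auto"
    if the fast tokenizer is the correct one for your model.
    """
    # Imported lazily so find_mismatches can be used without vLLM installed.
    from vllm import LLM

    return LLM(model=model_name, tokenizer_mode=tokenizer_mode)
```

With Hugging Face tokenizers, the two encoders could be the `encode` methods of `AutoTokenizer.from_pretrained(name, use_fast=True)` and `AutoTokenizer.from_pretrained(name, use_fast=False)`. If `find_mismatches` returns anything for your prompts, the tokenizer mode passed to vLLM should match the one used when the model was trained.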