chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): a local-knowledge-based LLM RAG and Agent application built with Langchain, supporting models such as ChatGLM, Qwen, and Llama
Apache License 2.0
31.66k stars · 5.52k forks

[BUG] ValueError: Tokenizer class Qwen2Tokenizer does not exist or is not currently imported. #3746

Closed — zixiaotan21 closed this 5 months ago

zixiaotan21 commented 6 months ago

Problem Description: When Qwen-1.5-72B is selected as the LLM model, langchain-chatchat fails to start with the error below.

Steps to Reproduce

  1. Run '...'
  2. Click '...'
  3. Scroll to '...'
  4. Problem occurs

Expected Result: No error.

Actual Result:

```
2024-04-15 09:14:40,683 - startup.py[line:651] - INFO: Starting services:
2024-04-15 09:14:40,684 - startup.py[line:652] - INFO: To view llm_api logs, go to C:\ai\langchain\logs
C:\ai\langchain\nltk_data NLTK_DATA_PATH
C:\ai\langchain\nltk_data NLTK_DATA_PATH
C:\ai\langchain\nltk_data NLTK_DATA_PATH
2024-04-15 09:14:48 | ERROR | stderr | INFO: Started server process [13400]
2024-04-15 09:14:48 | ERROR | stderr | INFO: Waiting for application startup.
2024-04-15 09:14:48 | ERROR | stderr | INFO: Application startup complete.
2024-04-15 09:14:48 | ERROR | stderr | INFO: Uvicorn running on http://192.168.210.11:20000 (Press CTRL+C to quit)
2024-04-15 09:14:49 | INFO | model_worker | Loading the model ['Qwen-72B-Chat'] on worker c192a079 ...
2024-04-15 09:14:49 | ERROR | stderr | Process model_worker - Qwen-72B-Chat:
2024-04-15 09:14:49 | ERROR | stderr | Traceback (most recent call last):
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 314, in _bootstrap
2024-04-15 09:14:49 | ERROR | stderr |     self.run()
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 108, in run
2024-04-15 09:14:49 | ERROR | stderr |     self._target(*self._args, **self._kwargs)
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\ai\langchain\startup.py", line 387, in run_model_worker
2024-04-15 09:14:49 | ERROR | stderr |     app = create_model_worker_app(log_level=log_level, **kwargs)
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\ai\langchain\startup.py", line 215, in create_model_worker_app
2024-04-15 09:14:49 | ERROR | stderr |     worker = ModelWorker(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\serve\model_worker.py", line 77, in __init__
2024-04-15 09:14:49 | ERROR | stderr |     self.model, self.tokenizer = load_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\model_adapter.py", line 265, in load_model
2024-04-15 09:14:49 | ERROR | stderr |     model, tokenizer = adapter.load_compress_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\model_adapter.py", line 101, in load_compress_model
2024-04-15 09:14:49 | ERROR | stderr |     return load_compress_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\compression.py", line 113, in load_compress_model
2024-04-15 09:14:49 | ERROR | stderr |     tokenizer = AutoTokenizer.from_pretrained(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 784, in from_pretrained
2024-04-15 09:14:49 | ERROR | stderr |     raise ValueError(
2024-04-15 09:14:49 | ERROR | stderr | ValueError: Tokenizer class Qwen2Tokenizer does not exist or is not currently imported.
```

Environment Information

  • langchain-ChatGLM version/commit: v0.2.9
  • Docker deployment: No
  • LLM model: Qwen-1.5-72B
  • Embedding model: bge-large-zh
  • Vector store type: faiss
  • Operating system and version: Windows-10-10.0.22631-SP0
  • Python version: 3.10.8

dosubot[bot] commented 6 months ago

To address the "ValueError: Tokenizer class Qwen2Tokenizer does not exist or is not currently imported" error when loading the Qwen-1.5-72B model with langchain-chatchat, follow these steps:

  1. Verify Tokenizer Existence: Ensure there is a tokenizer named Qwen2Tokenizer designed for the Qwen-1.5-72B model. This might require checking the model's documentation or the source from which the model was obtained.

  2. Implementation and Importation: If Qwen2Tokenizer exists:

    • Confirm it's correctly implemented in your project or available in a used library.
    • Import Qwen2Tokenizer in the script where AutoTokenizer.from_pretrained is called. If it's part of an external library, ensure the library is installed and imported correctly.
  3. Direct Instantiation: If Qwen2Tokenizer is not recognized by AutoTokenizer.from_pretrained (likely because it's a custom tokenizer not registered with the transformers library), you may need to instantiate Qwen2Tokenizer directly with the appropriate model path or identifier.

  4. Custom Tokenizer Registration: For a custom tokenizer, register it with the transformers library to use it with AutoTokenizer.from_pretrained. This involves adding your tokenizer class to the library's tokenizer class mapping.

  5. Check Model Identifier: Ensure the model identifier "Qwen-1.5-72B" passed to AutoTokenizer.from_pretrained is correct and matches the identifier expected by the transformers library or your custom implementation.
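In practice this error often means the installed transformers package predates Qwen2 support, which was added upstream around v4.37.0 (treat that exact threshold as an assumption and verify it against the transformers release notes). A minimal stdlib-only sketch of the version gate, with a hypothetical `supports_qwen2` helper:

```python
# Hypothetical helper: decide whether an installed transformers version is
# recent enough to provide Qwen2Tokenizer. The 4.37.0 threshold is an
# assumption -- confirm it against the transformers changelog.
def parse_version(version: str) -> tuple:
    """Turn a version string like '4.38.2' into (4, 38, 2) for comparison."""
    return tuple(int(part) for part in version.split(".")[:3] if part.isdigit())

def supports_qwen2(installed: str, required: str = "4.37.0") -> bool:
    """Return True if `installed` meets the assumed minimum for Qwen2."""
    return parse_version(installed) >= parse_version(required)
```

If the check fails (e.g. for 4.31.x), `pip install -U transformers` would be the natural next step before trying the tokenizer again.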

If these steps do not resolve the issue, further specific details about the Qwen2Tokenizer implementation or consulting the transformers library documentation may be necessary.
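Steps 2 and 3 above could be sketched as a loader that tries the generic AutoTokenizer route first and falls back to importing Qwen2Tokenizer directly. This is a hypothetical sketch, not the project's actual loading code; `model_path` stands in for your local Qwen-1.5 checkpoint directory:

```python
def load_tokenizer(model_path: str):
    """Try AutoTokenizer first; if the class lookup fails (as in the
    traceback above), fall back to importing Qwen2Tokenizer directly,
    raising a clearer error when transformers lacks Qwen2 support."""
    from transformers import AutoTokenizer

    try:
        return AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    except ValueError:
        # AutoTokenizer does not know the Qwen2Tokenizer class; try the
        # direct import, which only works on new-enough transformers.
        try:
            from transformers import Qwen2Tokenizer
        except ImportError as exc:
            raise RuntimeError(
                "Installed transformers has no Qwen2Tokenizer; "
                "try: pip install -U transformers"
            ) from exc
        return Qwen2Tokenizer.from_pretrained(model_path)
```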


#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.