Closed: zixiaotan21 closed this issue 5 months ago
To address the "ValueError: Tokenizer class Qwen2Tokenizer does not exist or is not currently imported" error when using the Qwen-1.5-72B model with langchain-chatchat, follow these steps:

1. Verify tokenizer existence: Ensure a tokenizer named Qwen2Tokenizer is available for the Qwen-1.5-72B model. This may require checking the model's documentation or the source from which the model was obtained.
2. Implementation and importation: If Qwen2Tokenizer exists, make sure it is imported in the script where AutoTokenizer.from_pretrained is called. If it is part of an external library, ensure that library is installed and imported correctly.
3. Direct instantiation: If AutoTokenizer.from_pretrained does not recognize Qwen2Tokenizer (likely because it is a custom tokenizer not registered with the transformers library), instantiate Qwen2Tokenizer directly with the appropriate model path or identifier.
4. Custom tokenizer registration: For a custom tokenizer, register it with the transformers library so that AutoTokenizer.from_pretrained can resolve it. This involves adding your tokenizer class to the library's tokenizer class mapping.
5. Check the model identifier: Ensure the model identifier "Qwen-1.5-72B" passed to AutoTokenizer.from_pretrained is correct and matches what the transformers library or your custom implementation expects.
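The first and last steps above can be checked mechanically: AutoTokenizer resolves the class name from the `tokenizer_class` field of the checkpoint's `tokenizer_config.json`, which is where the "Qwen2Tokenizer" string in the error comes from. A minimal stdlib sketch (the model directory path below is a placeholder, not taken from this issue):

```python
import json
from pathlib import Path
from typing import Optional

def declared_tokenizer_class(model_dir: str) -> Optional[str]:
    """Report which tokenizer class a local checkpoint asks for.

    AutoTokenizer reads the `tokenizer_class` field from the
    checkpoint's tokenizer_config.json; if the named class cannot be
    resolved by the installed transformers, from_pretrained raises
    the ValueError seen in this issue.
    """
    cfg = Path(model_dir) / "tokenizer_config.json"
    if not cfg.exists():
        return None
    return json.loads(cfg.read_text(encoding="utf-8")).get("tokenizer_class")

# Placeholder path -- point this at your local Qwen-1.5-72B checkout.
print(declared_tokenizer_class(r"C:\models\Qwen-1.5-72B"))
```

If this reports `Qwen2Tokenizer` but the class is not importable from your installed transformers, the mismatch is between the checkpoint and the library version, not the langchain-chatchat configuration.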
If these steps do not resolve the issue, more specific details about the Qwen2Tokenizer implementation, or the transformers library documentation, may be necessary.
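In practice, the most common cause of this error is simply an outdated transformers package: Qwen2Tokenizer ships with recent transformers releases (reportedly 4.37.0 and later; treat that floor as an assumption and verify it against the Qwen1.5 release notes), so older versions cannot resolve the class name. A stdlib sketch of the version gate:

```python
def needs_upgrade(installed: str, required: str = "4.37.0") -> bool:
    """Compare dotted version strings numerically.

    Sketch only: production code should use packaging.version.Version,
    and the 4.37.0 floor for Qwen2Tokenizer is an assumption to check
    against the Qwen1.5 / transformers release notes.
    """
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return as_tuple(installed) < as_tuple(required)

print(needs_upgrade("4.32.1"))  # True  -- older than the assumed floor
print(needs_upgrade("4.39.0"))  # False -- new enough
```

If the check reports an upgrade is needed, `pip install -U transformers` (pinned as `transformers>=4.37.0` if that is indeed the first release carrying Qwen2Tokenizer) is the likely fix.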
Problem Description
When Qwen-1.5-72B is selected as the LLM model, starting langchain-chatchat fails with an error.
Steps to Reproduce
Expected Result
No error.
Actual Result
2024-04-15 09:14:40,683 - startup.py[line:651] - INFO: Starting services:
2024-04-15 09:14:40,684 - startup.py[line:652] - INFO: To view llm_api logs, go to C:\ai\langchain\logs
C:\ai\langchain\nltk_data NLTK_DATA_PATH
C:\ai\langchain\nltk_data NLTK_DATA_PATH
C:\ai\langchain\nltk_data NLTK_DATA_PATH
2024-04-15 09:14:48 | ERROR | stderr | INFO:     Started server process [13400]
2024-04-15 09:14:48 | ERROR | stderr | INFO:     Waiting for application startup.
2024-04-15 09:14:48 | ERROR | stderr | INFO:     Application startup complete.
2024-04-15 09:14:48 | ERROR | stderr | INFO:     Uvicorn running on http://192.168.210.11:20000 (Press CTRL+C to quit)
2024-04-15 09:14:49 | INFO | model_worker | Loading the model ['Qwen-72B-Chat'] on worker c192a079 ...
2024-04-15 09:14:49 | ERROR | stderr | Process model_worker - Qwen-72B-Chat:
2024-04-15 09:14:49 | ERROR | stderr | Traceback (most recent call last):
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 314, in _bootstrap
2024-04-15 09:14:49 | ERROR | stderr |     self.run()
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 108, in run
2024-04-15 09:14:49 | ERROR | stderr |     self._target(*self._args, **self._kwargs)
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\ai\langchain\startup.py", line 387, in run_model_worker
2024-04-15 09:14:49 | ERROR | stderr |     app = create_model_worker_app(log_level=log_level, **kwargs)
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\ai\langchain\startup.py", line 215, in create_model_worker_app
2024-04-15 09:14:49 | ERROR | stderr |     worker = ModelWorker(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\serve\model_worker.py", line 77, in __init__
2024-04-15 09:14:49 | ERROR | stderr |     self.model, self.tokenizer = load_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\model_adapter.py", line 265, in load_model
2024-04-15 09:14:49 | ERROR | stderr |     model, tokenizer = adapter.load_compress_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\model_adapter.py", line 101, in load_compress_model
2024-04-15 09:14:49 | ERROR | stderr |     return load_compress_model(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\fastchat\model\compression.py", line 113, in load_compress_model
2024-04-15 09:14:49 | ERROR | stderr |     tokenizer = AutoTokenizer.from_pretrained(
2024-04-15 09:14:49 | ERROR | stderr |   File "C:\Users\fseport\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 784, in from_pretrained
2024-04-15 09:14:49 | ERROR | stderr |     raise ValueError(
2024-04-15 09:14:49 | ERROR | stderr | ValueError: Tokenizer class Qwen2Tokenizer does not exist or is not currently imported.
Environment Information
- langchain-ChatGLM version/commit: v0.2.9
- Deployed with Docker (yes/no): No
- LLM model (ChatGLM2-6B / Qwen-7B, etc.): Qwen-1.5-72B
- Embedding model (moka-ai/m3e-base, etc.): bge-large-zh
- Vector store type (faiss / milvus / pg_vector, etc.): faiss
- Operating system and version: Windows-10-10.0.22631-SP0
- Python version: 3.10.8
- Other relevant environment information: (none given)