lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported. #2487

Open thiner opened 1 year ago

thiner commented 1 year ago
~/repo/FastChat$ python -m fastchat.serve.model_worker --model-path ~/repo/models/Qwen-14B-Chat-Int4 --gptq-wbits 4 --gptq-groupsize 128 --model-names gpt-3.5-turbo
2023-09-28 14:36:05 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='~/repo/models/Qwen-14B-Chat-Int4', revision='main', device='cuda', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=4, gptq_groupsize=128, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, model_names=['gpt-3.5-turbo'], conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None)
2023-09-28 14:36:05 | INFO | model_worker | Loading the model ['gpt-3.5-turbo'] on worker c50b312b ...
2023-09-28 14:36:05 | INFO | stdout | Loading GPTQ quantized model...
2023-09-28 14:36:05 | ERROR | stderr | Traceback (most recent call last):
2023-09-28 14:36:05 | ERROR | stderr |   File "<frozen runpy>", line 198, in _run_module_as_main
2023-09-28 14:36:05 | ERROR | stderr |   File "<frozen runpy>", line 88, in _run_code
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 543, in <module>
2023-09-28 14:36:05 | ERROR | stderr |     args, worker = create_model_worker()
2023-09-28 14:36:05 | ERROR | stderr |                    ^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 518, in create_model_worker
2023-09-28 14:36:05 | ERROR | stderr |     worker = ModelWorker(
2023-09-28 14:36:05 | ERROR | stderr |              ^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 221, in __init__
2023-09-28 14:36:05 | ERROR | stderr |     self.model, self.tokenizer = load_model(
2023-09-28 14:36:05 | ERROR | stderr |                                  ^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/model/model_adapter.py", line 269, in load_model
2023-09-28 14:36:05 | ERROR | stderr |     model, tokenizer = load_gptq_quantized(model_path, gptq_config)
2023-09-28 14:36:05 | ERROR | stderr |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/modules/gptq.py", line 43, in load_gptq_quantized
2023-09-28 14:36:05 | ERROR | stderr |     tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
2023-09-28 14:36:05 | ERROR | stderr |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/miniconda3/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 724, in from_pretrained
2023-09-28 14:36:05 | ERROR | stderr |     raise ValueError(
2023-09-28 14:36:05 | ERROR | stderr | ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported.

I have already installed the Qwen model on the server, and it starts fine with Qwen's own script:

# cd to ~/repo/Qwen
python openai_api.py -c ~/repo/models/Qwen-14B-Chat-Int4 --server-name 0.0.0.0 --server-port 4000
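The failure happens because Qwen ships `QWenTokenizer` as custom code inside the model repo rather than as a class bundled with `transformers`, so `AutoTokenizer.from_pretrained` can only resolve it when `trust_remote_code=True` is passed through. FastChat's `fastchat/modules/gptq.py` calls it without that flag. A minimal sketch of the kind of change needed (the helper name `load_qwen_tokenizer` is hypothetical, not FastChat's API):

```python
# Hypothetical sketch, not the official FastChat fix: forward
# trust_remote_code=True so transformers can import the repo-local
# QWenTokenizer class shipped inside the Qwen checkpoint directory.
def load_qwen_tokenizer(model_path):
    from transformers import AutoTokenizer  # imported lazily

    return AutoTokenizer.from_pretrained(
        model_path,
        use_fast=False,          # matches FastChat's original call in gptq.py
        trust_remote_code=True,  # required for tokenizer classes defined in the model repo
    )
```

Note that `trust_remote_code=True` executes Python code downloaded with the checkpoint, so it should only be enabled for model sources you trust.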
meichangsu1 commented 1 year ago

+1

sjtu-scx commented 9 months ago

I also ran into a problem with the tokenizer, using Qwen-14B-Chat; Baichuan works fine.

1737686924 commented 7 months ago
(Same command, traceback, and `ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported.` as in the original report above.)

+1, me too