Open thiner opened 1 year ago
+1
I'm also hitting a tokenizer problem with Qwen-14B-Chat; Baichuan works fine.
```
~/repo/FastChat$ python -m fastchat.serve.model_worker --model-path ~/repo/models/Qwen-14B-Chat-Int4 --gptq-wbits 4 --gptq-groupsize 128 --model-names gpt-3.5-turbo
2023-09-28 14:36:05 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='~/repo/models/Qwen-14B-Chat-Int4', revision='main', device='cuda', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=4, gptq_groupsize=128, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, model_names=['gpt-3.5-turbo'], conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None)
2023-09-28 14:36:05 | INFO | model_worker | Loading the model ['gpt-3.5-turbo'] on worker c50b312b ...
2023-09-28 14:36:05 | INFO | stdout | Loading GPTQ quantized model...
2023-09-28 14:36:05 | ERROR | stderr | Traceback (most recent call last):
2023-09-28 14:36:05 | ERROR | stderr |   File "<frozen runpy>", line 198, in _run_module_as_main
2023-09-28 14:36:05 | ERROR | stderr |   File "<frozen runpy>", line 88, in _run_code
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 543, in <module>
2023-09-28 14:36:05 | ERROR | stderr |     args, worker = create_model_worker()
2023-09-28 14:36:05 | ERROR | stderr |                    ^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 518, in create_model_worker
2023-09-28 14:36:05 | ERROR | stderr |     worker = ModelWorker(
2023-09-28 14:36:05 | ERROR | stderr |              ^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/serve/model_worker.py", line 221, in __init__
2023-09-28 14:36:05 | ERROR | stderr |     self.model, self.tokenizer = load_model(
2023-09-28 14:36:05 | ERROR | stderr |                                  ^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/model/model_adapter.py", line 269, in load_model
2023-09-28 14:36:05 | ERROR | stderr |     model, tokenizer = load_gptq_quantized(model_path, gptq_config)
2023-09-28 14:36:05 | ERROR | stderr |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/repo/FastChat/fastchat/modules/gptq.py", line 43, in load_gptq_quantized
2023-09-28 14:36:05 | ERROR | stderr |     tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
2023-09-28 14:36:05 | ERROR | stderr |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-09-28 14:36:05 | ERROR | stderr |   File "~/miniconda3/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 724, in from_pretrained
2023-09-28 14:36:05 | ERROR | stderr |     raise ValueError(
2023-09-28 14:36:05 | ERROR | stderr | ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported.
```
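For what it's worth, the traceback bottoms out in FastChat's `load_gptq_quantized`, which calls `AutoTokenizer.from_pretrained(model_name, use_fast=False)` with no `trust_remote_code`. Qwen ships `QWenTokenizer` as custom "remote code" inside the model repo, so `AutoTokenizer` can only resolve that class when `trust_remote_code=True` is passed. A minimal sketch of the likely fix (the `load_tokenizer` wrapper and kwargs dict are my own illustration, not FastChat code):

```python
# Kwargs that FastChat's gptq.py would need to pass for Qwen models.
# Assumption: the only missing piece is trust_remote_code; use_fast=False
# mirrors the original FastChat call shown in the traceback.
QWEN_TOKENIZER_KWARGS = {
    "use_fast": False,
    "trust_remote_code": True,  # lets transformers import QWenTokenizer from the repo
}

def load_tokenizer(model_path):
    # Imported lazily so the sketch reads standalone; in FastChat this import
    # already exists at the top of fastchat/modules/gptq.py.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_path, **QWEN_TOKENIZER_KWARGS)
```

Until FastChat plumbs this flag through, patching the `from_pretrained` call in `fastchat/modules/gptq.py` locally may be the quickest workaround.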
I already have the Qwen model installed on the server, and it starts fine with Qwen's own script:
```
# cd to ~/repo/Qwen
python openai_api.py -c ~/repo/models/Qwen-14B-Chat-Int4 --server-name 0.0.0.0 --server-port 4000
```
+1, me too