lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

FastChat - error on 4bit GPTQ #455

Status: Open (opened by steppige 1 year ago)

steppige commented 1 year ago

Hi, since I updated FastChat to version 0.2.2, I can no longer run 4-bit GPTQ models. I get this error:

python3 -m fastchat.serve.cli --model-path models/TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g --wbits 4 --groupsize 128
usage: cli.py [-h] [--model-path MODEL_PATH] [--device {cpu,cuda,mps}]
              [--num-gpus NUM_GPUS] [--load-8bit]
              [--conv-template CONV_TEMPLATE] [--temperature TEMPERATURE]
              [--max-new-tokens MAX_NEW_TOKENS] [--style {simple,rich}]
              [--debug]
cli.py: error: unrecognized arguments: --wbits 4 --groupsize 128

How can I fix this? Thank you bye!

zhisbug commented 1 year ago

I don't think we have any official support for GPTQ-4bit. But I'll take a look at GPTQ this week and update on this issue.
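For anyone hitting the same message: it is a standard argparse error, raised because the parser does not define `--wbits` or `--groupsize`. Below is a minimal sketch that reproduces the behavior. The parser is a hypothetical reconstruction based only on part of the usage line in the report, not FastChat's actual source, and `unsupported_flags` is an illustrative helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser rebuilt from the usage message in the report;
    # this is an assumption, not FastChat's real cli.py.
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("--model-path")
    parser.add_argument("--device", choices=["cpu", "cuda", "mps"])
    parser.add_argument("--num-gpus")
    parser.add_argument("--load-8bit", action="store_true")
    return parser


def unsupported_flags(argv):
    # parse_known_args() returns the leftover arguments instead of exiting,
    # so we can see exactly which ones the parser does not recognize.
    _, unknown = build_parser().parse_known_args(argv)
    return unknown


argv = [
    "--model-path", "models/TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g",
    "--wbits", "4",
    "--groupsize", "128",
]
print(unsupported_flags(argv))
```

With `parse_args()` in place of `parse_known_args()`, the same leftover list triggers the "unrecognized arguments" exit shown in the report, so the flags only work once the CLI itself defines them.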

alanxmay commented 1 year ago

@zhisbug Hi, I opened a new PR to add GPTQ-4bit support. Can you take a look and give some advice? Thanks! #1209

surak commented 8 months ago

@steppige is this still an issue?