lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0
36.59k stars 4.52k forks

Trying to load a safetensors file #1530

Open shaunstoltz opened 1 year ago

shaunstoltz commented 1 year ago

Getting this error:

Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory models/.

cmd:

python3 -m fastchat.serve.cli --model-path models/ --num-gpus 4

vchauhan1 commented 1 year ago

You can use a command like the one below:

python3 -m fastchat.serve.model_worker \
    --model-path models/vicuna-7B-1.1-GPTQ-4bit-128g \
    --gptq-ckpt models/vicuna-7B-1.1-GPTQ-4bit-128g/vicuna-7B-1.1-GPTQ-4bit-128g.safetensors \
    --gptq-wbits 4 \
    --gptq-groupsize 128 \
    --gptq-act-order

To run the command above, you need to install: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/fastest-inference-4bit

The full set of changes is in this branch, which has not been merged yet:

https://github.com/alanxmay/FastChat/tree/fastest-gptq-4bit-support
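Before switching loaders, it may help to confirm what is actually sitting in the model directory. The following is a minimal sketch, not part of FastChat: the function name find_weight_files is hypothetical, the list of expected names is copied from the error message above, and it only checks for file names (it does not load or validate any weights).

```python
from pathlib import Path

# File names copied from the error message quoted in this thread.
EXPECTED_NAMES = (
    "pytorch_model.bin",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
)


def find_weight_files(model_dir):
    """Hypothetical helper: return (expected, safetensors) file names found in model_dir.

    Only the top level of the directory is inspected.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return [], []
    expected = [name for name in EXPECTED_NAMES if (root / name).is_file()]
    safetensors = sorted(p.name for p in root.glob("*.safetensors"))
    return expected, safetensors


if __name__ == "__main__":
    # "models/" is the path used in the original command.
    expected, safetensors = find_weight_files("models/")
    print("expected weight files found:", expected)
    print(".safetensors files found:", safetensors)
```

If the first list is empty and the second is not, the directory holds only .safetensors checkpoints, which matches the situation described in this issue.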