juncongmoo / pyllama

LLaMA: Open and Efficient Foundation Language Models

Is it possible to quantize a locally converted model instead of downloading from Hugging Face? #91

Closed: chigkim closed this issue 1 year ago

chigkim commented 1 year ago

I converted the weights using this command:

python -m llama.convert_llama --ckpt_dir ../models --tokenizer_path ../models/tokenizer.model --model_size 7b --output_dir llama-hf
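
For reference, the output directory should be a regular local Hugging Face model, so (if I understand correctly) it loads with something like this. Just a sketch assuming the standard layout; note the LLaMA class names have changed across transformers versions:

from transformers import LlamaForCausalLM, LlamaTokenizer

# local paths produced by the convert step above
tokenizer = LlamaTokenizer.from_pretrained("llama-hf/tokenizer")
model = LlamaForCausalLM.from_pretrained("llama-hf/llama-7b")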

Now I'm trying to quantize the converted model from the command above. However, when I run the following:

python -m llama.llama_quant c4 --ckpt_dir llama-hf/llama-7b --tokenizer_path llama-hf/tokenizer --wbits 4 --groupsize 128 --save pyllama-7B4b.pt

it throws an error about a missing positional argument.

What should I put before c4 so that it uses the model on my hard drive instead of downloading from Hugging Face?
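
From the error, my guess is that the missing positional argument is the model itself, passed before the dataset instead of via --ckpt_dir. So something like the following, assuming the model argument accepts a local directory the same way from_pretrained does (llama-hf/llama-7b is the output path from the convert step above):

python -m llama.llama_quant llama-hf/llama-7b c4 --wbits 4 --groupsize 128 --save pyllama-7B4b.pt

But I haven't been able to confirm this, so any pointers would be appreciated.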

chigkim commented 1 year ago

Sorry, this is a duplicate of #60.