LLaMA: Open and Efficient Foundation Language Models
Is it possible to quantize a locally converted model instead of downloading from Hugging Face? #91
Closed
chigkim closed 1 year ago
I have converted weights using this command.
python -m llama.convert_llama --ckpt_dir ../models --tokenizer_path ../models/tokenizer.model --model_size 7b --output_dir llama-hf
Now I'm trying to quantize the converted model produced by the command above. If I run the following,
python -m llama.llama_quant c4 --ckpt_dir llama-hf/llama-7b --tokenizer_path llama-hf/tokenizer --wbits 4 --groupsize 128 --save pyllama-7B4b.pt
it throws an error about a missing positional argument.
What should I put before c4 so it uses the model on my hard drive instead of downloading one from Hugging Face?
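A hedged guess at what the error suggests: the quantization entry point seems to expect the model identifier itself as a positional argument before the dataset name (`c4`), rather than via `--ckpt_dir`. If that is the case, pointing that positional at the local output directory from the conversion step might look like this (the paths are the ones from the commands above; whether a local path is accepted in place of a Hub model id is an assumption, not confirmed by the source):

```shell
# Sketch, not verified: assumes llama.llama_quant takes the model as its
# first positional argument, followed by the calibration dataset ("c4").
# llama-hf/llama-7b is the local directory produced by llama.convert_llama.
python -m llama.llama_quant llama-hf/llama-7b c4 \
    --wbits 4 --groupsize 128 --save pyllama-7B4b.pt
```

If the script only accepts Hugging Face model ids in that position, an alternative worth checking is whether it honors a local directory the same way `transformers`' `from_pretrained` does, since the converted output is in HF format.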