OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

quantize custom model trained on alpaca-like dataset #13

Closed. ghost closed this issue 10 months ago.

ghost commented 11 months ago

Can you guide me on how to quantize a custom Llama-2-13B model fine-tuned on an alpaca-like dataset? Thank you so much!

ChenMnZ commented 11 months ago
  1. Set --model to your model path.
  2. Set --net to Llama-2-13b.

For example:

    CUDA_VISIBLE_DEVICES=0 python main.py \
    --model /PATH/TO/YOUR/MODEL --eval_ppl \
    --epochs 20 --output_dir /PATH/TO/log \
    --wbits 3 --abits 16 --group_size 128 --lwc \
    --net Llama-2-13b
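
If the fine-tuned checkpoint is not already in Hugging Face format, a minimal sketch (not from this repo) of exporting it into a directory that can be passed to --model; the paths are placeholders and the dtype handling is an assumption:

    # Sketch only: save a fine-tuned Llama-2-13B checkpoint in Hugging Face
    # format so the resulting directory can be passed to --model above.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    src = "/PATH/TO/YOUR/FINETUNED/CHECKPOINT"   # hypothetical source path
    dst = "/PATH/TO/YOUR/MODEL"                  # directory passed to --model

    model = AutoModelForCausalLM.from_pretrained(src, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(src)

    model.save_pretrained(dst)      # writes config.json and the weight shards
    tokenizer.save_pretrained(dst)  # tokenizer files, needed for evaluation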
ghost commented 11 months ago

@ChenMnZ What about --calib_dataset? Do I need to adjust this argument? I fine-tuned this model on a custom dataset with the following format:

{"instruction": "",
 "input": "",
 "output": ""}
ChenMnZ commented 11 months ago

We found that the current WikiText-2 calibration dataset works well for quantizing instruction-tuned models. You can also try adding --aug_loss, which is beneficial for some quantization settings.
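
For context, WikiText-2 calibration batches for LLM quantization are commonly built by sampling random fixed-length token windows from the concatenated training split. A rough sketch with the Hugging Face datasets library (the exact loader in this repo may differ, and nsamples/seqlen are illustrative):

    # Rough sketch of a typical WikiText-2 calibration loader; the one used
    # by main.py may differ in details.
    import random
    import torch
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("/PATH/TO/YOUR/MODEL")  # placeholder
    train = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    ids = tokenizer("\n\n".join(train["text"]), return_tensors="pt").input_ids

    nsamples, seqlen = 128, 2048  # illustrative values
    calib = []
    for _ in range(nsamples):
        start = random.randint(0, ids.shape[1] - seqlen - 1)
        calib.append(ids[:, start:start + seqlen])
    calib = torch.cat(calib, dim=0)  # (nsamples, seqlen) token ids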