OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

quantize custom model trained on alpaca-like dataset #13

Closed. ghost closed this issue 10 months ago.

ghost commented 11 months ago

Can you guide me on how to quantize a custom Llama-2-13B model fine-tuned on an alpaca-like dataset? Thank you so much!

ChenMnZ commented 11 months ago
  1. Set --model to your model path.
  2. Set --net to Llama-2-13b.

For example:

    CUDA_VISIBLE_DEVICES=0 python main.py \
    --model /PATH/TO/YOUR/MODEL --eval_ppl \
    --epochs 20 --output_dir /PATH/TO/log \
    --wbits 3 --abits 16 --group_size 128 --lwc \
    --net Llama-2-13b
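
If the fine-tuned checkpoint is not already in Hugging Face format, a minimal sketch (not from this repo) of exporting it into a directory that can be passed to --model; the paths are placeholders and the dtype handling is an assumption:

    # Sketch only: save a fine-tuned Llama-2-13B checkpoint in Hugging Face
    # format so the resulting directory can be passed to --model above.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    src = "/PATH/TO/YOUR/FINETUNED/CHECKPOINT"   # hypothetical source path
    dst = "/PATH/TO/YOUR/MODEL"                  # directory passed to --model

    model = AutoModelForCausalLM.from_pretrained(src, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(src)

    model.save_pretrained(dst)      # writes config.json and the weight shards
    tokenizer.save_pretrained(dst)  # tokenizer files, needed for evaluation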
ghost commented 11 months ago

@ChenMnZ What about --calib_dataset? Do I need to adjust this argument? I fine-tuned this model on a custom dataset with the following format:

{"instruction": "",
 "input": "",
 "output": ""}
ChenMnZ commented 11 months ago

We found that the current WikiText-2 calibration dataset works well for quantizing instruction-tuned models. You can also try adding --aug_loss, which is beneficial for some quantization settings.
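
For context, WikiText-2 calibration batches for LLM quantization are commonly built by sampling random fixed-length token windows from the concatenated training split. A rough sketch with the Hugging Face datasets library (the exact loader in this repo may differ, and nsamples/seqlen are illustrative):

    # Rough sketch of a typical WikiText-2 calibration loader; the one used
    # by main.py may differ in details.
    import random
    import torch
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("/PATH/TO/YOUR/MODEL")  # placeholder
    train = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    ids = tokenizer("\n\n".join(train["text"]), return_tensors="pt").input_ids

    nsamples, seqlen = 128, 2048  # illustrative values
    calib = []
    for _ in range(nsamples):
        start = random.randint(0, ids.shape[1] - seqlen - 1)
        calib.append(ids[:, start:start + seqlen])
    calib = torch.cat(calib, dim=0)  # (nsamples, seqlen) token ids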