casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

Discussion on the selection of the calibration data set #541

Open beep-bebop opened 3 months ago

beep-bebop commented 3 months ago

Is using SFT data directly as calibration data the best option? Does the quantized model's performance fluctuate depending on whether I have more (e.g. 280k) or fewer (e.g. 1k) fine-tuning samples? Also, would using datasets from other domains as calibration data lead to a gain or a large loss in performance? Thanks for sharing your practical experience.

wangzhongren-code commented 1 month ago

Hi, I have also recently been considering using my own calibration dataset. Could you provide any guidance or tutorials on how to use a custom calibration dataset? Thanks in advance.
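
For the custom calibration question, here is a minimal sketch of one way to turn SFT records into a small, reproducible list of calibration strings. The helper `build_calibration_texts`, the `prompt`/`response` field names, and the defaults (`n_samples=128`, `min_chars=32`) are all hypothetical choices for illustration, not anything AutoAWQ prescribes. The commented-out call at the bottom assumes that `model.quantize` accepts a list of strings through `calib_data`, as the AutoAWQ documentation examples suggest; please check it against your installed version.

```python
import random
from typing import Dict, List


def build_calibration_texts(
    records: List[Dict[str, str]],
    n_samples: int = 128,
    min_chars: int = 32,
    seed: int = 0,
) -> List[str]:
    """Turn SFT records into a bounded list of plain-text calibration samples."""
    # Join prompt and response so each sample resembles the text seen in use.
    texts = [
        f"{r['prompt'].strip()}\n{r['response'].strip()}"
        for r in records
    ]
    # Drop very short samples, which contribute few tokens.
    texts = [t for t in texts if len(t) >= min_chars]
    # Sample without replacement; a fixed seed keeps the subset reproducible.
    rng = random.Random(seed)
    if len(texts) > n_samples:
        texts = rng.sample(texts, n_samples)
    return texts


if __name__ == "__main__":
    # Hypothetical SFT data; replace with your own records.
    sft = [
        {
            "prompt": f"Question {i}: what is {i} + {i}?",
            "response": f"The answer is {2 * i}.",
        }
        for i in range(1000)
    ]
    calib = build_calibration_texts(sft, n_samples=128)
    print(len(calib))

    # Assumed AutoAWQ usage (not run here; verify against your version):
    # from awq import AutoAWQForCausalLM
    # from transformers import AutoTokenizer
    # model = AutoAWQForCausalLM.from_pretrained(model_path)
    # tokenizer = AutoTokenizer.from_pretrained(model_path)
    # quant_config = {"zero_point": True, "q_group_size": 128,
    #                 "w_bit": 4, "version": "GEMM"}
    # model.quantize(tokenizer, quant_config=quant_config, calib_data=calib)
```

Because the helper caps the number of samples, the same code path works whether the SFT set has 1k or 280k rows. Whether in-domain or out-of-domain calibration text gives better accuracy is not something this sketch answers; that would need an evaluation on your own task.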