casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

deepseek-coder-v2-instruction-awq #520

Closed: fengyang95 closed this issue 1 day ago

fengyang95 commented 1 week ago

I noticed that deepseek-v2 is already supported. Could you please release deepseek-coder-v2-instruct-awq to Hugging Face?

monoluage commented 1 week ago

+1

casper-hansen commented 1 week ago

Which coding datasets are good for this specific model?

fengyang95 commented 1 week ago

> Which coding datasets are good for this specific model?

I'm not an expert; perhaps the Pile dataset is enough.

fengyang95 commented 1 week ago

> Which coding datasets are good for this specific model?

May I ask what resources are needed to quantize such a large model (over 200B parameters)? Using a private training dataset should yield better results, right?

TechxGenus commented 1 week ago

I recommend using https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1 or https://huggingface.co/datasets/teknium/OpenHermes-2.5.
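For illustration, here is a minimal sketch of how one of those datasets could be fed to AutoAWQ's documented `quantize()` API as custom calibration data. The model id, the `instruction`/`output` column names, the 512-sample count, and the quant config values are all assumptions, not a tested recipe for this model:

```python
# Sketch: quantizing with a custom coding calibration set (untested assumption).
# Uses AutoAWQ's documented from_pretrained/quantize/save_quantized API.
from datasets import load_dataset
from transformers import AutoTokenizer
from awq import AutoAWQForCausalLM

model_path = "deepseek-ai/DeepSeek-Coder-V2-Instruct"  # assumed HF model id
quant_path = "deepseek-coder-v2-instruct-awq"
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Build calibration texts from the suggested dataset; the "instruction"/"output"
# column names follow the public dataset card (an assumption here).
data = load_dataset("nickrosh/Evol-Instruct-Code-80k-v1", split="train")
calib_data = [
    f"{row['instruction']}\n{row['output']}" for row in data.select(range(512))
]

model = AutoAWQForCausalLM.from_pretrained(
    model_path, low_cpu_mem_usage=True, use_cache=False, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# calib_data accepts a list of strings in addition to a named dataset.
model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_data)
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```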

fengyang95 commented 5 days ago

> Which coding datasets are good for this specific model?

May I ask if you have any plans to do this quantization in the near future?

casper-hansen commented 5 days ago

I will release a quantized version of the model as soon as I have time to do it.

casper-hansen commented 1 day ago

This cost me about $110; I hope it suffices. So far I have only run a perplexity test, which came out to 5.325. https://huggingface.co/casperhansen/deepseek-coder-v2-instruct-awq
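For anyone wanting to try the released checkpoint, a minimal loading sketch using AutoAWQ's documented `from_quantized()` API; the prompt, sampling settings, and `fuse_layers=False` choice are illustrative assumptions, not the settings used for the perplexity run:

```python
# Sketch: loading the released AWQ checkpoint for inference (illustrative only).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = "casperhansen/deepseek-coder-v2-instruct-awq"

# fuse_layers=False is a conservative assumption for this architecture.
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=False)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)

prompt = "Write a Python function that reverses a linked list."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```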