IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Pretrained Weights for Bloom and BloomZ (4-bit) #10

Closed · agemagician closed 1 year ago

agemagician commented 1 year ago

Hi,

Thanks a lot for the excellent work.

Could you share pretrained 4-bit weights for BLOOM and BLOOMZ with us?

efrantar commented 1 year ago

Hi, sharing such extremely large models is a bit tricky; simply rerunning the code is probably easier (it should work as is for bigscience/bloom; we have not used BLOOMZ before, so I am not sure about that one). In the past few months, many GPTQ-quantized models (produced by follow-up projects to the original paper implementation in this repository) have been uploaded to Hugging Face, so perhaps you will find a suitable already-quantized model there (e.g., this looks to be a 4-bit BLOOMZ version).
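
For reference, rerunning the quantization yourself would look roughly like the sketch below. This assumes the repo's bloom.py script takes the same positional arguments (model name, calibration dataset) and flags (--wbits, --save) as the other per-model scripts; check the README for the exact invocation and supported options.

```sh
# Quantize BLOOM to 4-bit with GPTQ, using C4 as the calibration data.
# bloom-560m is shown as a quick smoke test; the full bigscience/bloom
# is ~176B parameters and needs very large amounts of CPU RAM to load.
CUDA_VISIBLE_DEVICES=0 python bloom.py bigscience/bloom-560m c4 \
    --wbits 4 \
    --save bloom-560m-4bit.pt
```

The saved checkpoint can then be loaded for evaluation or inference with the corresponding script options; for an unfamiliar variant like BLOOMZ, the same command with the BLOOMZ model name is the natural thing to try first, since the architectures match.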