IST-DASLab / gptq

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
https://arxiv.org/abs/2210.17323
Apache License 2.0

Pretrained Weights for Bloom and BloomZ (4-bit) #10

Closed · agemagician closed 1 year ago

agemagician commented 1 year ago

Hi,

Thanks a lot for the excellent work.

Could you share pretrained 4-bit weights for BLOOM and BLOOMZ with us?

efrantar commented 1 year ago

Hi, sharing such extremely large models is a bit tricky; simply rerunning the code is probably easier (it should work as is for bigscience/bloom; we have not used BLOOMZ before, so I am not sure about that one). In the past few months, many GPTQ-quantized models (produced by follow-up projects to the original paper implementation in this repository) have been uploaded to Hugging Face, so perhaps you will find a suitable already-quantized model there (e.g., this looks to be a 4-bit BLOOMZ version).
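
For reference, rerunning the quantization yourself would look roughly like the sketch below. This assumes the repo's bloom.py script takes the same positional arguments (model name, calibration dataset) and flags (--wbits, --save) as the other per-model scripts; check the README for the exact invocation and supported options.

```sh
# Quantize BLOOM to 4-bit with GPTQ, using C4 as the calibration data.
# bloom-560m is shown as a quick smoke test; the full bigscience/bloom
# is ~176B parameters and needs very large amounts of CPU RAM to load.
CUDA_VISIBLE_DEVICES=0 python bloom.py bigscience/bloom-560m c4 \
    --wbits 4 \
    --save bloom-560m-4bit.pt
```

The saved checkpoint can then be loaded for evaluation or inference with the corresponding script options; for an unfamiliar variant like BLOOMZ, the same command with the BLOOMZ model name is the natural thing to try first, since the architectures match.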