replit / ReplitLM

Inference code and configs for the ReplitLM model family
https://huggingface.co/replit
Apache License 2.0

Update modeling_mpt.py #19

Closed. madhavatreplit closed this 1 year ago

madhavatreplit commented 1 year ago

Why

A community PR on the Hugging Face repo enables loading the model with 8-bit and 4-bit quantization: https://huggingface.co/replit/replit-code-v1-3b/discussions/19/files

What changed

Replicated that PR's changes here.

Testing

Tested that loading the model in 8-bit and 4-bit quantization works; a sketch of the loading calls is below.
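
For reference, a minimal sketch of how the quantized loads might be exercised from the Hub, assuming `transformers` with `bitsandbytes` and `accelerate` installed. The argument names follow the standard `BitsAndBytesConfig` API and are not taken from the PR itself:

```python
# Sketch only: loading replit-code-v1-3b in 8-bit and 4-bit quantization.
# Exact arguments may differ from what the PR exercises.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "replit/replit-code-v1-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# 8-bit loading
model_8bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required: the model ships custom modeling_mpt.py
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# 4-bit loading
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
```

Because the model uses remote code, the quantized path only works if the custom `modeling_mpt.py` on the Hub accepts these loading arguments, which is what the upstream PR and this change address.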

Rollout