replit / ReplitLM

Inference code and configs for the ReplitLM model family
https://huggingface.co/replit
Apache License 2.0

Update README.md for quantization #20

madhavatreplit closed this 1 year ago

madhavatreplit commented 1 year ago

Why

We enabled support for loading models in 8-bit and 4-bit quantization in https://github.com/replit/ReplitLM/pull/19.

We want to update the model README with steps on how to do this.

What changed

We updated the README with steps for loading the model in 8-bit and in 4-bit quantization.
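As a rough sketch of what such loading steps look like, the snippet below loads a ReplitLM checkpoint in 8-bit and 4-bit using `bitsandbytes` through the `transformers` quantization API. This is an illustration, not the README's exact text: the checkpoint name `replit/replit-code-v1-3b` and the specific flags are assumptions, and running it requires a GPU plus the `bitsandbytes` and `accelerate` packages.

```python
# Hedged sketch of 8-bit / 4-bit loading via transformers + bitsandbytes.
# Assumptions: checkpoint name, GPU availability, bitsandbytes/accelerate installed.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "replit/replit-code-v1-3b"  # assumed checkpoint name

# ReplitLM ships custom modeling code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Load in 8-bit quantization.
model_8bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

# Load in 4-bit quantization.
model_4bit = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
```

`device_map="auto"` lets `accelerate` place the quantized weights on the available GPU(s); the 8-bit and 4-bit paths differ only in the `BitsAndBytesConfig` flag passed as `quantization_config`.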

Testing

Ran training and instruct fine-tuning following the guide successfully.

Rollout