jagilley opened this issue 1 year ago
Should be pretty doable. This model would run nicely on a T4 or equivalent hardware: https://huggingface.co/TheBloke/Llama-2-13B-GGUF
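For reference, here's a minimal sketch of how you might run one of those quants with llama-cpp-python, assuming you've already downloaded a Q4_K_M file from that repo. The filename and settings are assumptions, not something I've benchmarked on a T4:

```python
from llama_cpp import Llama

# A 4-bit 13B GGUF (~8 GB) fits comfortably in a T4's 16 GB of VRAM,
# so all layers can be offloaded to the GPU (n_gpu_layers=-1).
llm = Llama(
    model_path="llama-2-13b.Q4_K_M.gguf",  # hypothetical local path
    n_gpu_layers=-1,
    n_ctx=4096,
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```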
This may not actually be useful, though: quantization often works against batching, since GGUF/llama.cpp-style inference is tuned for single-stream, low-memory use rather than high-throughput batched serving.