jagilley / hfppl

Probabilistic programming with HuggingFace language models. Updated with additional examples
https://probcomp.github.io/hfppl/

Add GGUF support for quantized models via `ctransformers` #1

Open jagilley opened 11 months ago

jagilley commented 11 months ago

Should be pretty doable. This quantized model should run nicely on a T4 or equivalent hardware: https://huggingface.co/TheBloke/Llama-2-13B-GGUF
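For a rough sense of why quantization matters here, a minimal sketch follows. It estimates weight memory only (ignoring the KV cache and activations) and assumes a 16 GB card such as the T4. The helper names `estimate_weight_gb`, `fits_in_memory`, and `load_gguf` are hypothetical, the 4.5 bits-per-weight figure is an illustrative assumption rather than a measured value for any particular GGUF file, and `model_file` must be a filename that actually exists in the repo. The loader only shows the `ctransformers` call; it does not wire the model into hfppl.

```python
def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_memory(n_params: float, bits_per_weight: float, memory_gb: float = 16.0) -> bool:
    """True if the weights alone fit; real usage needs extra headroom on top."""
    return estimate_weight_gb(n_params, bits_per_weight) < memory_gb


def load_gguf(repo_id: str, model_file: str, gpu_layers: int = 0):
    """Load a GGUF model through ctransformers (imported lazily).

    model_file is required on purpose: pick a file listed in the repo.
    gpu_layers controls how many layers are offloaded to the GPU.
    """
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo_id,
        model_file=model_file,
        model_type="llama",
        gpu_layers=gpu_layers,
    )


if __name__ == "__main__":
    n_params = 13e9  # Llama-2-13B
    for label, bits in [("fp16", 16), ("~4-bit quantized (assumed 4.5 bpw)", 4.5)]:
        gb = estimate_weight_gb(n_params, bits)
        print(f"{label}: {gb:.1f} GB, fits in 16 GB: {fits_in_memory(n_params, bits)}")
```

Under these assumptions the fp16 weights come to 26 GB and do not fit, while the roughly 4-bit weights come to about 7.3 GB and do, leaving room for the cache and activations.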

jagilley commented 11 months ago

This may not actually be useful, given that quantization often works against batching.