turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License
3.2k stars · 236 forks

Is there a Google Colab to quantize LLMs using exllamav2? #333

Closed JamesKnight0001 closed 2 weeks ago

JamesKnight0001 commented 4 months ago

I don't have the hardware needed to quantize a 13B-parameter model and don't want to waste credits on Vast.ai.

Anthonyg5005 commented 4 months ago

This is probably something to ask in Discussions, but here is a Colab notebook I found from the llm-course repo: Colab

turboderp commented 2 weeks ago

There is that, but the converter is also designed to be run locally, i.e. if you can run the quantized model, you should also be able to quantize it.
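
As a rough sketch of what a local run looks like: the paths below are placeholders and the flag names are assumptions from memory of the repo's `convert.py`, so verify them against `python convert.py --help` before running.

```bash
# Sketch of a local EXL2 quantization run (paths are placeholders,
# flag names should be checked against convert.py --help):
#   -i   unquantized HF model directory
#   -o   scratch/working directory
#   -cf  output directory for the quantized model
#   -b   target bits per weight
python convert.py \
    -i /models/llama2-13b-fp16 \
    -o /tmp/exl2-work \
    -cf /models/llama2-13b-exl2-4bpw \
    -b 4.0
```

As turboderp notes above, the process works roughly within the memory footprint of running the quantized model, so a GPU that can serve the result should generally be able to produce it.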