turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

CPU offloading? #162

Closed: oobabooga closed this issue 9 months ago

Anindyadeep commented 7 months ago

Is CPU offloading possible, @oobabooga?
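For context on what the question is asking: "CPU offloading" usually means keeping most of a model's weights in system RAM and streaming layers onto the GPU only while they run, trading speed for VRAM. This is not an exllamav2 API (whether the library supports it is exactly what this issue asks); the snippet below is just a minimal PyTorch sketch of the general idea, with `nn.Linear` blocks standing in for transformer layers.

```python
import torch
import torch.nn as nn

# Sketch of layer-wise CPU offloading (illustrative only, NOT exllamav2 code):
# weights live in system RAM and each layer is moved to the GPU just-in-time.
device = "cuda" if torch.cuda.is_available() else "cpu"

layers = [nn.Linear(64, 64) for _ in range(4)]  # stand-ins for transformer blocks
for layer in layers:
    layer.to("cpu")  # park weights in system RAM between uses

x = torch.randn(1, 64)
for layer in layers:
    layer.to(device)         # stream this layer's weights onto the GPU
    x = layer(x.to(device))  # run the layer where the weights now are
    layer.to("cpu")          # evict weights back to RAM to free VRAM
    x = x.cpu()

print(tuple(x.shape))
```

The repeated host-to-device copies are why offloading is much slower than keeping the whole model resident in VRAM; real implementations hide some of that cost by prefetching the next layer while the current one computes.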