turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Support C4AI Command-R #372

Closed — djmaze closed this issue 6 months ago

djmaze commented 6 months ago

Command-R is reportedly very useful for RAG and agent use, and it also has good multilingual support. It would be nice to have an efficient version using EXL2 quantization.

JoeySalmons commented 6 months ago

Looks like EXL2 quants are up: https://huggingface.co/turboderp/command-r-v01-35B-exl2

@turboderp Thank you!

turboderp commented 6 months ago

Yep, supported as of release 0.0.16.