Closed: Anthonyg5005 closed this 8 months ago
If you're interested in performance, look into aphrodite; you'll easily get 10x the performance: https://github.com/PygmalionAI/aphrodite-engine
Alternatively, koboldcpp is also blazing fast.
I'll check out aphrodite. I was looking for something that could use exl2, but I'll see if I can figure out awq.
Seems like I'll use koboldcpp. I tried aphrodite, but it took hours to get working and awq wasn't worth it. The gguf gave me slightly lower accuracy, but its speed is similar to what I was getting with exl2.
Seems like aphrodite got exl2 support.
I think it'd be a good idea to support it. I'd like to be able to use Text-gen-webui instead of KoboldAI, since I get a 4x increase in performance. I know there's a pull request for it, but #278 uses an outdated API.