ashakoen closed this 2 months ago
You're welcome! :)
@aredden Kudos! There's a definite need for an open source, performance-oriented (non-interactive) inference server for production use.
This is a very hackable codebase, and performance really is 2x that of other solutions. Thank you!
I've been learning a lot by getting this project up and running on a lower-end GPU. I've had no major issues, and I already have LoRA loading working. Just wanted to say thanks!!