gkvoelkl opened this issue 1 month ago (status: Open)
Hi @gkvoelkl! Candle-vllm is a great option. For an example of how to build such a chatbot in pure Rust, see openai_server.rs.
I would also recommend checking out mistral.rs. Besides PagedAttention, it supports Metal, CPU, vision models, adapter models, quantization, and many other features, including a crate meant for use inside an application (docs, examples).
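To illustrate what "use inside an application" can look like from the caller's side, here is a minimal sketch with no HTTP server involved. Every name in it (`ChatBackend`, `EchoBackend`, `Chat`, `Message`, `Role`) is a hypothetical placeholder and not the API of candle-vllm or mistral.rs. The stub backend only echoes the prompt so the sketch runs without model weights; a real integration would implement the trait on top of whichever inference crate you choose.

```rust
// Hypothetical sketch: none of these names come from candle-vllm or mistral.rs.

/// Who wrote a message in the conversation.
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    System,
    User,
    Assistant,
}

/// One turn in the conversation.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: String,
}

/// What the application needs from any in-process inference engine.
pub trait ChatBackend {
    fn complete(&mut self, history: &[Message]) -> Result<String, String>;
}

/// Stand-in backend (assumption): echoes the latest user message.
pub struct EchoBackend;

impl ChatBackend for EchoBackend {
    fn complete(&mut self, history: &[Message]) -> Result<String, String> {
        let last = history
            .iter()
            .rev()
            .find(|m| m.role == Role::User)
            .ok_or_else(|| "no user message in history".to_string())?;
        Ok(format!("echo: {}", last.content))
    }
}

/// Chat session that owns the backend and the history; plain function calls only.
pub struct Chat<B: ChatBackend> {
    backend: B,
    history: Vec<Message>,
}

impl<B: ChatBackend> Chat<B> {
    pub fn new(backend: B, system_prompt: &str) -> Self {
        let history = vec![Message {
            role: Role::System,
            content: system_prompt.to_string(),
        }];
        Chat { backend, history }
    }

    /// Record the user prompt, ask the backend, and record its reply.
    pub fn ask(&mut self, prompt: &str) -> Result<String, String> {
        self.history.push(Message {
            role: Role::User,
            content: prompt.to_string(),
        });
        let reply = self.backend.complete(&self.history)?;
        self.history.push(Message {
            role: Role::Assistant,
            content: reply.clone(),
        });
        Ok(reply)
    }

    pub fn history(&self) -> &[Message] {
        &self.history
    }
}

fn main() {
    let mut chat = Chat::new(EchoBackend, "You are a helpful assistant.");
    let reply = chat.ask("Hello").expect("backend failed");
    println!("{reply}");
}
```

Keeping the application code behind a small trait like this means the chatbot logic does not care whether the backend is a local model or a remote server, so you can swap one for the other later.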
Hi Eric, great Rust program.
I am looking for a crate that lets me use a chatbot function within my Rust program. I tried to do that with candle. I hope it will be better documented in the future.
Will it be possible to call a candle-vllm function without starting an explicit server, so that I can use candle-vllm directly within my program?
Thanks
Best regards Gerhard