Open risedangel opened 7 months ago
Hello, I have a RAG application that I want to use with fastgen. Is it possible to achieve such a thing? Or is there any way I can "serve" the model so that llama_index can query it through an API?
I got it working by running it with the OpenAI-compatible model server and llama_index's OpenAILike LLM: https://docs.llamaindex.ai/en/v0.9.48/api_reference/llms/openai_like.html
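Roughly, the setup looks like this. This is a minimal sketch, assuming the model is served behind an OpenAI-compatible HTTP endpoint; the base URL, port, and model name below are placeholders, not values from a real deployment:

```python
# Sketch: point llama_index's OpenAILike LLM at a locally served,
# OpenAI-compatible endpoint (llama_index v0.9.x import path).
from llama_index.llms import OpenAILike

llm = OpenAILike(
    api_base="http://localhost:8000/v1",  # assumed address of the local server
    api_key="fake",  # local servers typically don't validate the key
    model="mistralai/Mistral-7B-Instruct-v0.1",  # must match the served model
    is_chat_model=True,  # set according to how the endpoint is exposed
)

# Simple completion call to verify the connection; in a RAG app you would
# instead pass `llm` into your query engine / service context.
response = llm.complete("Hello, what can you do?")
print(response)
```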
@risedangel Could you share your implementation?