InftyAI / llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Support OpenAPI #37

Open kerthcet opened 4 months ago

kerthcet commented 4 months ago

Right now we support inference engines like vLLM for inference; what if people want to call OpenAPIs like ChatGPT? It should be easy to integrate them.

kerthcet commented 4 months ago

/kind feature /priority important-longterm

qinguoyi commented 1 week ago

I think this issue should be given the highest priority, so that conversation UIs and the like can be integrated into production environments as soon as possible.

I also think it would get more people interested in participating.

What's your opinion? @kerthcet

kerthcet commented 1 week ago

I think we need a unified platform to communicate with all kinds of models, whether OpenAPI or the llmaz API. For OpenAPI, I think we can integrate them in the dashboard, either using an open-sourced one (which must be easy to extend for secondary development) or a self-developed one built on tools like https://github.com/BerriAI/litellm, Gradio, etc.

For the llmaz API, we can work on this step by step:

  • as a first step, we should be able to list the models/services (still created via the kubectl command) for conversations (a minimal sketch follows below)
  • as a next step, we should be able to create models/services, and even delete them

Then it would look like we have a list of models:

  • ChatGPT
  • VertexAI
  • Llama3 in llmaz
  • Qwen in llmaz

and people can choose to chat with any kind of model. Any suggestions?
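A minimal sketch of what the read-only first step could look like, say from a dashboard backend, using the Python Kubernetes client. The `llmaz.io/v1alpha1` group/version and the `openmodels` resource are taken from the API path quoted later in this thread; the cluster-scoped listing is an assumption:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig; inside a pod you would use
# config.load_incluster_config() instead.
config.load_kube_config()

api = client.CustomObjectsApi()

# List the OpenModel custom resources. Group/version/resource come from the
# /apis/llmaz.io/v1alpha1/openmodels path mentioned below; cluster scope is
# an assumption here.
models = api.list_cluster_custom_object(
    group="llmaz.io", version="v1alpha1", plural="openmodels"
)
for item in models.get("items", []):
    print(item["metadata"]["name"])
```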

qinguoyi commented 1 week ago

> I think we need a unified platform to communicate with all kinds of models, whether OpenAPI or the llmaz API. [...] and people can choose to chat with any kind of model. Any suggestions?

Sorry, I misunderstood this issue.

What I mean is that llmaz should expose a set of interfaces that conform to an OpenAI-compatible input and output format, so that users can find usage scenarios as soon as possible.

For example, various open source chat UIs could be integrated with this project, using llmaz as the backend, allowing users to quickly experiment with customized models.

But this issue should be more about directly supporting OpenAPI interface integration rather than model integration.

Is that right?
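For instance, if llmaz (or a thin gateway in front of it, e.g. LiteLLM as mentioned above) exposed such an OpenAI-compatible surface, any chat UI built on the standard `openai` Python client could use it unchanged. The base URL, API key, and model name below are placeholders, not existing llmaz endpoints:

```python
from openai import OpenAI

# Point the standard OpenAI client at a hypothetical llmaz-backed,
# OpenAI-compatible endpoint; URL, key, and model name are placeholders.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="placeholder")

# A chat UI would first list the advertised models...
for model in client.models.list():
    print(model.id)

# ...then talk to one of them through the usual chat-completions call.
reply = client.chat.completions.create(
    model="llama3-in-llmaz",
    messages=[{"role": "user", "content": "Hello from a chat UI!"}],
)
print(reply.choices[0].message.content)
```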

kerthcet commented 1 week ago

I believe we already expose APIs following the OpenAPI specification; for example, you can visit http://localhost:8001/apis/llmaz.io/v1alpha1/openmodels or use client-go to list all the models, and we also have a Python library to query the objects.

But yes, the scheme is a little complex compared to other APIs; vLLM, for instance, uses http://localhost:8080/v1/models to fetch the model list.

I have no idea whether other chatbots would like to integrate with our project, and I don't think a standard protocol exists for things like what the model-list API should look like, how to create a model, or what its parameters are.

But we can still start with our dashboard, and once there's a standard protocol, exporting the APIs will be really easy; or we can provide a Python library as well.
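For comparison, a sketch of the two list calls as they stand today, assuming `kubectl proxy` serving on port 8001 and a vLLM server on port 8080 (both URLs taken from the comments above):

```python
import requests

# Kubernetes-style list: the group/version/resource path served by the API
# server, reachable on localhost:8001 via `kubectl proxy`. The response is a
# standard Kubernetes list object with an "items" array.
k8s = requests.get(
    "http://localhost:8001/apis/llmaz.io/v1alpha1/openmodels"
).json()
print([m["metadata"]["name"] for m in k8s.get("items", [])])

# OpenAI-style list: the flat /v1/models scheme vLLM uses; the response
# wraps the models in a "data" array of {"id": ...} objects.
oai = requests.get("http://localhost:8080/v1/models").json()
print([m["id"] for m in oai.get("data", [])])
```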