talkdai / dialog

RAG LLM Ops App for easy deployment and testing
https://dialog.talkd.ai
MIT License

Support for one model per endpoint approach #198

Closed by vmesel 3 months ago

vmesel commented 3 months ago

Hey, I've been wondering what the next big implementation we need to tackle here inside @talkdai should be. After a quick chat with @avelino, we agreed that we should start working on allowing users to use multiple models in the same deployment, making it less resource-expensive to run multiple LLMs and prompts.

This approach is quite simple: we need to support any LLM class that a user supplies to us through the .toml file, and allow the user to choose the URL path for that model as well as the prompt it should use.

A quick draft of this new section in the .toml file looks like this:

[[multimodels.endpoint]]
endpoint = "/model1"
class = "dialog_lib.models.Model1"
prompt = "You are a wonderful bot called Justin."

[[multimodels.endpoint]]
endpoint = "/default"
class = "dialog_lib.models.Model2"
prompt = "You are a nice bot called Mike."

The modification to dialog would be simple: when the project loads, our system should iterate through the TOML endpoints, read each entry's settings, and set up routers using them.
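
A rough sketch of that loading step, assuming dialog's FastAPI app and the endpoint entries parsed as above; the model interface shown here is hypothetical:

import importlib
from fastapi import APIRouter, FastAPI

def build_router(path: str, class_path: str, prompt: str) -> APIRouter:
    # Resolve the configured LLM class from its dotted path, e.g. "dialog_lib.models.Model1".
    module_name, class_name = class_path.rsplit(".", 1)
    model_cls = getattr(importlib.import_module(module_name), class_name)

    router = APIRouter()

    @router.post(path)
    async def ask(message: str):
        # Hypothetical interface: instantiate the model with its prompt and process the message.
        model = model_cls(prompt=prompt)
        return {"answer": model.process(message)}

    return router

app = FastAPI()
for entry in endpoints:  # the list read from the [[multimodels.endpoint]] section above
    app.include_router(build_router(entry["endpoint"], entry["class"], entry["prompt"]))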

vmesel commented 3 months ago

https://stackoverflow.com/questions/76635770/how-to-test-fastapi-application-without-sharing-the-same-application-between-tes
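
For context, the pattern discussed there is building a fresh app per test instead of importing one shared module-level instance; a minimal sketch with pytest and FastAPI's TestClient, using a hypothetical create_app() factory:

import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient

def create_app() -> FastAPI:
    # Hypothetical factory: build a brand-new app (and its routers) on every call.
    app = FastAPI()

    @app.get("/health")
    async def health():
        return {"status": "ok"}

    return app

@pytest.fixture
def client():
    # Each test gets its own application instance, so routers registered from the
    # TOML config in one test do not leak into another.
    return TestClient(create_app())

def test_health(client):
    assert client.get("/health").json() == {"status": "ok"}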

avelino commented 3 months ago

@vmesel the pr "only has" python code, don't you need to update the documentation?

how do i configure the models on different endpoints?

do i have to read the code to understand ?

the basis of open source is communication and simplification of use, we are getting both wrong

vmesel commented 3 months ago

@avelino you are right! I missed that part.