talkdai / dialog

RAG LLM Ops App for easy deployment and testing
https://dialog.talkd.ai
MIT License
378 stars · 49 forks

Future of the Project #175

Closed vmesel closed 5 months ago

vmesel commented 7 months ago

Hey guys, I'm writing this because this week I was discussing with @avelino and we came to a common realization: we don't know what we want to achieve with the project anymore.

This project started as a simple RAG PoC, trying to understand how to implement and use langchain and other technologies to create humanized LLMs, but now we are walking down a path that is not common ground for everyone.

IMO, we should aim to be an open-source tool that lets developers plug and play any LLM and test it through a CLI and REST APIs, while also providing ways to easily deploy it and modify the context so they can apply Talkd.ai Dialog to their day-to-day problems.

@avelino, @walison17, @lgabs and any other user interested in this project, could you share your thoughts?

lgabs commented 7 months ago

Nice discussion, and very important for this project, @vmesel! This kind of debate helps us get on the same page about the roadmap, i.e., it helps our collaboration here. I'll give my thoughts below, but first I find it useful to give some context about myself.

Context

To give some context, I've used some frameworks for chatbots for some time now, including:

So far, in all cases I found it difficult not only to build a working application, but to deploy it as well. Then LLMs introduced new possibilities for dialogs and for the tasks that could be accomplished with them. In October 2022, LangChain arrived with the idea of building a new kind of application, as they say:

LangChain is a framework for developing applications powered by large language models (LLMs).

Later, the community saw the need to create technologies to monitor and deploy this new kind of application, as they say:

LangSmith: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.

LangServe: A library for deploying LangChain chains as REST APIs.

Creating LLM Applications

The number of use cases for LangChain is as large as the number of applications of LLMs, i.e., it's impractical to build a single product that tackles many different tasks, even using these frameworks. When I first used LangChain at my current job (last year), I had a clear problem: a chatbot capable of answering FAQs, a typical customer-support case where you don't want to leave your customer lost in some rule-based decision tree.

This kind of task is usually solved with the RAG technique, which is a better approach for factual recall (there is a very nice OpenAI cookbook about that), and it's a typical use case for LangChain. I used it with chromadb to build an application (served with Django), but it was not clear to me how to deploy my chain properly, and we had some latency issues in the API (I didn't test LangServe, which appeared around October 2023).
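To make the pattern concrete, here is a minimal sketch of that kind of RAG flow, using chromadb for retrieval and the OpenAI chat API for generation. The FAQ entries, model name, and function names are purely illustrative, not what I actually shipped:

```python
# Minimal RAG sketch: retrieve relevant FAQ entries with chromadb,
# then let the LLM answer grounded on them. Illustrative only.
import chromadb
from openai import OpenAI

chroma = chromadb.Client()
faq = chroma.create_collection("faq")

# Index a few FAQ entries (chromadb embeds them with its default model).
faq.add(
    ids=["1", "2"],
    documents=[
        "Refunds are processed within 5 business days.",
        "Orders can be cancelled while their status is 'pending'.",
    ],
)

def answer(question: str, k: int = 2) -> str:
    # Retrieval step: fetch the k most similar FAQ entries.
    hits = faq.query(query_texts=[question], n_results=k)
    context = "\n".join(hits["documents"][0])
    # Generation step: ask the LLM to answer only from that context.
    llm = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long do refunds take?"))
```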

Then, while I was taking some days off, @avelino told me about the creation of this open-source project, following the same idea of RAG for Q&A but delivering a much more production-ready project than the PoC I had done, so I immediately liked it, since it could help me and the community deploy this kind of LLM application. Today, we can see that LangServe's objective is also about deploying chains, but not full LLM applications.

TL;DR

Now, looking at the current application and using it, I still think this project is about taking a specific (difficult) problem, Q&A with RAG, and building an infrastructure around it to deliver to the community a production-ready LLM application for that problem. At first, one may think this application is just about answering FAQs like a company's FAQ: you plug in a vectordb for retrieval and voilà, your LLM can now answer questions about your specific domain.

It seems easy in theory (since everyone talks about it now), but practice will demand many custom tools to create, test, evaluate, debug, and extend such an application, e.g.: evaluating vector search performance, evaluating the LLM's answers, setting up the vectordb and/or a db for memory, allowing classification of conversations, using agent tools to extend actions (calling APIs or handling specific tasks like cancelling an order or finding something outside the knowledge base), etc. All of this orbits around the same problem of answering factual questions with RAG, and projects like this could empower developers to build and deploy these applications much faster and with a lot of control and transparency over the whole process.
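As an example of the first tool in that list, a recall@k check over a small hand-labelled set is about as simple as vector-search evaluation gets. This builds on the `faq` collection from my sketch above; the labels are made up:

```python
# Sketch of one evaluation tool: recall@k for the vector search step.
def recall_at_k(collection, dataset, k: int = 2) -> float:
    """dataset: list of (question, id of the document that answers it)."""
    hits = sum(
        expected_id in collection.query(query_texts=[q], n_results=k)["ids"][0]
        for q, expected_id in dataset
    )
    return hits / len(dataset)

labelled = [
    ("How long do refunds take?", "1"),
    ("Can I cancel my order?", "2"),
]
print(f"recall@2: {recall_at_k(faq, labelled):.2f}")
```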

vmesel commented 7 months ago

@lgabs thanks for the full context and your POV. I agree with you on some points: we don't have an easy way to deploy a RAG or any other LLM application yet, and we still lack testing capabilities in most of the existing frameworks (even LangChain's family is quite limited in some aspects).

The only addition/change I would make to your POV is that we could start adding features for more general LLM approaches, such as tooling or other structures.

One of the frameworks I'm watching right now is crewAI -> it's pretty cool, Brazilian-made and Brazilian-built, and it gives us more architectures for LLMs.

I do agree that we shouldn't focus on multiple LLM approaches right now.

What I have in my mental roadmap:

avelino commented 6 months ago

After 3 weeks I've arrived here to share my vision for the project and the path it could follow - sorry for the delay in responding publicly, even though I've already shared my vision with all of you on a private call.

Looking at the RAG/LLM "market," it is very much tied to engineers developing code using RAG/LLM and deploying it. I miss a solution that would allow someone with no software engineering knowledge to "play" with developed RAGs and switch them for testing (and even use them in production easily).

In other words, I would steer talkd/dialog towards being a solution that gives someone with Ops knowledge (devops, not necessarily a software engineer) the autonomy to swap in and test other developed RAGs within talkd/dialog (and even use them in production easily).

talkd/dialog would have a marketplace of RAGs, making it easy for users to adopt them, for example by installing a package from PyPI and configuring, in the .toml, the class that will be invoked at the endpoint.
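Roughly what I have in mind, as a sketch; the `[rag]` table and `my_rag_package` are hypothetical, just to illustrate the idea:

```python
# Sketch of resolving the configured RAG class at startup.
import importlib
import tomllib  # stdlib since Python 3.11

CONFIG = """
[rag]
class = "my_rag_package.chains.FAQRag"
"""

def load_rag_class(raw_toml: str):
    config = tomllib.loads(raw_toml)
    # Split "my_rag_package.chains.FAQRag" into module path and class name.
    module_path, _, class_name = config["rag"]["class"].rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Works once "my_rag_package" is installed from PyPI; the server would
# instantiate this class and wire it to the endpoint.
RagClass = load_rag_class(CONFIG)
```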

for engineers

I would split the project into two parts:

  1. talkd/dialog: a lib distributed on PyPI so that projects can use talkd/dialog as a framework and co-create on top of what we will have in the lib;
  2. talkd/dialog-server: the HTTP API/server, developed to put a RAG/LLM solution into production "without reliance" on an engineer - the server uses talkd/dialog (a rough sketch of the split follows below).
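To make the split concrete, here's a rough sketch, assuming a FastAPI-style server; every name below is illustrative, not a real API:

```python
# Rough sketch of the lib/server split; all names are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

# --- what would live in the talkd/dialog lib, published on PyPI ---
class BaseRag:
    """Base class that marketplace RAGs would subclass and ship."""
    def ask(self, question: str) -> str:
        raise NotImplementedError

class EchoRag(BaseRag):
    """Stand-in RAG so the sketch runs end to end."""
    def ask(self, question: str) -> str:
        return f"you asked: {question}"

# --- what would live in talkd/dialog-server ---
app = FastAPI()
rag: BaseRag = EchoRag()  # in practice, resolved from the .toml config

class Question(BaseModel):
    text: str

@app.post("/ask")
def ask(question: Question) -> dict:
    # The Ops user swaps RAGs via config, never touching this code.
    return {"answer": rag.ask(question.text)}
```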

Will we no longer be just another "humanized conversation API"?

talkd/dialog itself will not be, but a RAG for this type of solution will be distributed.


Comment on what you think of this path, @walison17, @lgabs, @vmesel.

vmesel commented 6 months ago

@avelino dialog is already taken on PyPI; that's why I've got dialog-lib for the library.

Looking at the RAG/LLM "market," it is very much tied to engineers developing code using RAG/LLM and deploying it. I miss a solution that would allow someone with no software engineering knowledge to "play" with developed RAGs and switch them for testing (and even use them in production easily).

Engineering perspective: from your point of view, it would also make sense to work on UI improvements, so we have a front-end for testing and researching new LLMs, as well as support for running multiple LLMs at once.

I agree with your vision for the project invocation, just not on the TOML (I don't think it's the best interface, but that's just an engineering detail).

avelino commented 6 months ago

Apparently, we have the path we will follow, right?

Now we need to publicly communicate this "change of course", making it clear:

we need to tell the story; having a changelog in the documentation would be the best place!