talkdai / dialog

RAG LLM Ops App for easy deployment and testing
https://dialog.talkd.ai
MIT License
378 stars · 49 forks

Future of the Project #175

Closed vmesel closed 5 months ago

vmesel commented 7 months ago

Hey guys, I'm writing this because this week I was discussing with @avelino and we came to a common realization: we don't know what we want to achieve with the project anymore.

This project started as a simple RAG PoC, trying to understand how to implement and use langchain and other technologies to create humanized LLMs, but now we are walking down a path that is not common ground for everyone.

IMO, we should aim to be an open-source tool that lets developers plug and play any LLM and test it through a CLI and REST APIs, while also providing ways to easily deploy it and modify the context so they can apply Talkd.ai Dialog to their day-to-day problems.

@avelino, @walison17, @lgabs and any other user interested in this project, could you share your thoughts?

lgabs commented 7 months ago

Nice discussion, and very important for this project, @vmesel! This kind of debate helps us get on the same page about the roadmap, i.e., it helps our collaboration here. I'll give my thoughts below, but first I find it useful to give some context about myself.

Context

To give some context, I've used some frameworks for chatbots for some time now, including:

So far, in all cases I found it difficult not only to build a working application, but to deploy it as well. Then LLMs introduced new possibilities for dialogs and for the tasks that could be accomplished with them. In October 2022, LangChain arrived with the idea of building a new kind of application, as they say:

LangChain is a framework for developing applications powered by large language models (LLMs).

Later, the community saw the need to create technologies to monitor and deploy this new kind of application, as they say:

LangSmith: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.

LangServe: A library for deploying LangChain chains as REST APIs.

Creating LLM Applications

The number of use cases for LangChain is as large as the number of applications of LLMs, i.e., it's impractical to build a single product that tackles many different tasks, even using these frameworks. When I first used LangChain at my current job (last year), I had a clear problem: a chatbot capable of answering FAQs, a typical customer-support case where you don't want to leave your customer lost in some rule-based decision tree.

This kind of task is usually solved with the RAG technique, which is a better approach for factual recall (there is a very nice OpenAI cookbook about that), and it's a typical use case for LangChain. I used it with chromadb to build an application (served with Django), but it was not clear to me how to deploy my chain properly, and we had some latency issues in the API (I didn't test LangServe, which appeared around October 2023).
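To make the pattern concrete, here is a minimal sketch of that kind of RAG flow, using chromadb for retrieval and the OpenAI chat API for generation. The FAQ entries, model name, and function names are purely illustrative, not what I actually shipped:

```python
# Minimal RAG sketch: retrieve relevant FAQ entries with chromadb,
# then let the LLM answer grounded on them. Illustrative only.
import chromadb
from openai import OpenAI

chroma = chromadb.Client()
faq = chroma.create_collection("faq")

# Index a few FAQ entries (chromadb embeds them with its default model).
faq.add(
    ids=["1", "2"],
    documents=[
        "Refunds are processed within 5 business days.",
        "Orders can be cancelled while their status is 'pending'.",
    ],
)

def answer(question: str, k: int = 2) -> str:
    # Retrieval step: fetch the k most similar FAQ entries.
    hits = faq.query(query_texts=[question], n_results=k)
    context = "\n".join(hits["documents"][0])
    # Generation step: ask the LLM to answer only from that context.
    llm = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long do refunds take?"))
```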

Then, while I was taking some days off, @avelino told me about the creation of this open-source project, following the same idea of RAG for Q&A but delivering a much more production-ready project than the PoC I had done, so I immediately liked it, since it could help me and the community deploy this kind of LLM application. Today, we can see that LangServe's objective is also about deploying chains, but not full LLM applications.

TL;DR

Now, looking at the current application and using it, I still think this project is about taking a specific (difficult) problem, Q&A with RAG, and building an infrastructure around it to deliver to the community a production-ready LLM application for that problem. At first, one may think this application is just about answering FAQs like a company's FAQ: you plug in a vectordb for retrieval and voilà, your LLM can now answer questions about your specific domain.

It seems easy in theory (since everyone talks about it now), but practice will demand many custom tools to create, test, evaluate, debug, and extend such an application, e.g.: evaluating vector search performance, evaluating the LLM's answers, setting up the vectordb and/or a db for memory, allowing classification of conversations, using agent tools to extend actions (calling APIs or handling specific tasks like cancelling an order or finding something outside the knowledge base), etc. All of this orbits around the same problem of answering factual questions with RAG, and projects like this could empower developers to build and deploy these applications much faster and with a lot of control and transparency over the whole process.
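As an example of the first tool in that list, a recall@k check over a small hand-labelled set is about as simple as vector-search evaluation gets. This builds on the `faq` collection from my sketch above; the labels are made up:

```python
# Sketch of one evaluation tool: recall@k for the vector search step.
def recall_at_k(collection, dataset, k: int = 2) -> float:
    """dataset: list of (question, id of the document that answers it)."""
    hits = sum(
        expected_id in collection.query(query_texts=[q], n_results=k)["ids"][0]
        for q, expected_id in dataset
    )
    return hits / len(dataset)

labelled = [
    ("How long do refunds take?", "1"),
    ("Can I cancel my order?", "2"),
]
print(f"recall@2: {recall_at_k(faq, labelled):.2f}")
```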

vmesel commented 7 months ago

@lgabs thanks for the full context and your POV. I agree with you on some points: we don't have an easy way to deploy a RAG or any other LLM application yet, and we still lack testing capabilities in most of the existing frameworks (even LangChain's family is quite limited in some aspects).

The only addition/change I would make to your POV is that we could start adding features for more general LLM approaches, such as tooling or other structures.

One of the frameworks I'm watching right now is crewAI -> it's pretty cool, Brazilian-made and Brazilian-built, and it gives us more architectures for LLMs.

I do agree that we shouldn't focus on multiple LLM approaches right now.

What I have in my mental roadmap:

avelino commented 6 months ago

After 3 weeks I've arrived here to share my vision for the project and the path it could follow - sorry for the delay in responding publicly, even though I've already shared my vision with all of you on a private call.

Looking at the RAG/LLM "market," it is very much tied to engineers developing code using RAG/LLM and deploying it. I miss a solution that would allow someone with no software engineering knowledge to "play" with developed RAGs and switch them for testing (and even use them in production easily).

In other words, I would steer talkd/dialog towards being a solution that gives someone with Ops knowledge (devops, not necessarily a software engineer) the autonomy to swap in and test other developed RAGs within talkd/dialog (and even use them in production easily).

talkd/dialog would have a marketplace of RAGs, making it easy for users to adopt them, for example by installing a package from PyPI and configuring, in the .toml, the class that will be invoked at the endpoint.
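Roughly what I have in mind, as a sketch; the `[rag]` table and `my_rag_package` are hypothetical, just to illustrate the idea:

```python
# Sketch of resolving the configured RAG class at startup.
import importlib
import tomllib  # stdlib since Python 3.11

CONFIG = """
[rag]
class = "my_rag_package.chains.FAQRag"
"""

def load_rag_class(raw_toml: str):
    config = tomllib.loads(raw_toml)
    # Split "my_rag_package.chains.FAQRag" into module path and class name.
    module_path, _, class_name = config["rag"]["class"].rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)

# Works once "my_rag_package" is installed from PyPI; the server would
# instantiate this class and wire it to the endpoint.
RagClass = load_rag_class(CONFIG)
```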

for engineers

I would split the project into two parts:

  1. talkd/dialog: a lib distributed on PyPI so that projects can use talkd/dialog as a framework and co-create on top of what we will have in the lib;
  2. talkd/dialog-server: the HTTP API/server, developed to put a RAG/LLM solution into production "without reliance" on an engineer - the server uses talkd/dialog (a rough sketch of the split follows below).
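To make the split concrete, here's a rough sketch, assuming a FastAPI-style server; every name below is illustrative, not a real API:

```python
# Rough sketch of the lib/server split; all names are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

# --- what would live in the talkd/dialog lib, published on PyPI ---
class BaseRag:
    """Base class that marketplace RAGs would subclass and ship."""
    def ask(self, question: str) -> str:
        raise NotImplementedError

class EchoRag(BaseRag):
    """Stand-in RAG so the sketch runs end to end."""
    def ask(self, question: str) -> str:
        return f"you asked: {question}"

# --- what would live in talkd/dialog-server ---
app = FastAPI()
rag: BaseRag = EchoRag()  # in practice, resolved from the .toml config

class Question(BaseModel):
    text: str

@app.post("/ask")
def ask(question: Question) -> dict:
    # The Ops user swaps RAGs via config, never touching this code.
    return {"answer": rag.ask(question.text)}
```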

Will we no longer be just another "humanized conversation API"?

talkd/dialog itself will not be, but a RAG for this type of solution will be distributed.


Comment on what you think of this path, @walison17, @lgabs, @vmesel.

vmesel commented 6 months ago

@avelino dialog is already taken on PyPI; that's why I've got dialog-lib for the library.

Looking at the RAG/LLM "market," it is very much tied to engineers developing code using RAG/LLM and deploying it. I miss a solution that would allow someone with no software engineering knowledge to "play" with developed RAGs and switch them for testing (and even use them in production easily).

Engineering perspective: from your point of view, it would also make sense to work on UI improvements, so we have a front-end for testing and researching new LLMs, as well as support for running multiple LLMs at once.

I agree with your vision for the project invocation, just not on the TOML (I don't think it's the best interface, but that's just an engineering detail).

avelino commented 6 months ago

Apparently, we have the path we will follow, right?

Now we need to publicly communicate this "change of course", making it clear:

we need to tell the story; having a changelog in the documentation would be the best place!