@lgabs do you want to do this yourself? Let me know so I can assign this issue to you! =)
@vmesel yes, I want to explore it!
@vmesel, OK, I understand there were many changes at once. However, I needed to go through the whole process to fully understand the final solution. I can break these ideas into smaller steps while trying to keep compatibility (although the project is in its early stages, and at some point we'll break structures in favor of new ideas and deprecate old ones).
Some questions I have:
About the plugins' signature: would the `dialog-whatsapp` plugin be affected here and here only? I think that with LCEL and chains used as proposed, any plugin would use the same interface (import a chain and call `chain.invoke` to process the user's message), still keeping a simple interface; a sketch follows below.
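Roughly, what I imagine for a plugin (a sketch only; the import path and function name are hypothetical):

```python
# Hypothetical plugin code: the plugin no longer needs to know how
# the chain is built internally; it just imports it and calls `invoke`.
from dialog.llm import chain  # hypothetical import path

def process_user_message(message: str) -> str:
    # every plugin would share this same one-line integration point
    return chain.invoke({"input": message})
```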
And how is a contributor supposed to test plugins (currently just one) against changes? Maybe by mocking dialog's call inside the plugins? I don't know yet.
They still use postgres, but I've implemented it in a way that is more compatible with LCEL and langchain components (taking advantage of their methods for saving/retrieving messages and embeddings). The main breaking changes I see here are:

- (A) the change of collections (schemas), which are already implemented by langchain's classes. I think we should follow them so we can keep our focus on chains and on serving them (for message history, records are kept in langchain's Message format; for embeddings, it keeps a good format that includes the doc's metadata). This does not change dialog's behavior externally (endpoints work the same way); it would only break a user's usage if the user is somehow consuming the collections externally;
- (B) creating a session before interacting is not necessary anymore, since langchain's `PostgresChatMessageHistory` already creates one when it does not exist. This would not actually break `/chat/{chat_id}`, but it would indeed remove the need for `/session` and would break users who rely on it; see the sketch below.
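To illustrate (B), a minimal sketch of the behavior (the connection details are placeholders; this assumes the `langchain_community` import path):

```python
# Sketch of point (B), assuming langchain_community's
# PostgresChatMessageHistory; connection details are placeholders.
from langchain_community.chat_message_histories import PostgresChatMessageHistory

history = PostgresChatMessageHistory(
    connection_string="postgresql://user:pass@localhost/dialog",
    session_id="chat-123",  # no prior /session call needed: the table
)                           # and session records are created on first use

history.add_user_message("hi!")
history.add_ai_message("hello, how can I help?")
print(history.messages)  # stored in langchain's Message format
```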
I think these breaking changes make sense in order to take full advantage of the langchain framework. Can you think of other breaking changes?
What do you contributors think?
Apart from these topics, I'll update this issue soon with a proposed "roadmap" to adopt LCEL across the whole process, so we can also discuss: embeddings, chains (prompt templating, RAG, routing, LLM calls, output parsing) and history management.
Some ideas in the direction of following LCEL:
Current LLM processing in the endpoint `/chat/{chat_id}`:

- the endpoint imports the `LLM` variable, without its instance (which inherits from `AbstractLLM`);
- it instantiates the `LLM` variable with the project config and session (which is used to get the user's history and save messages to it);
- it calls `llm_instance.process`, and the `process` method handles the message until its final output (in LCEL, this would be the chain invocation through the `invoke` method).

What I'm suggesting:
- have the `process` method implement the invocation of its associated chain, using LCEL under the hood. This would not change the interface in `main.py` (nor in the plugins), since it stays backward-compatible, and it would allow new chains as well;
- migrate the current flow into that `process` method. I think most parts of it would not be a problem, but two topics need attention: the retriever (using the vector store) and the memory, both already implemented;
- for the retriever, there is `embeddings.py`, which already implements retrieval from the vector DB manually. I'd have to check how to take advantage of this implementation or how to adapt langchain's `PGVector` to find a way to use the retriever as a runnable (ideally, all chain components should be langchain runnables).

@vmesel, this is a good first step towards adopting LCEL, since it has small changes, but it does not close the issue. I think we should discuss further in the issue (where I gave several ideas, and where it would be good to hear other ideas) what the roadmap could look like to really get an LCEL interface that langchain users will easily understand.
For example, I think that `AbstractLLM` forces a flow that langchain's chains were already designed for (the idea of input data, a retrieval phase, prompt generation, branches and if-else cases, LLM calls, post-processing and output parsers). In my opinion, to really make dialog flexible with LCEL, the roadmap should go in the direction of removing this custom class in favor of building chains entirely with langchain's objects at every step (runnables), since this is what guarantees new functionalities like streaming, async, parallelism, retries/fallbacks, access to intermediate results, and tracing with langsmith. In this direction, I imagine dialog exposing in the API any chain the user wants to build or even combine, with a default chain available (using runnables in all steps); a sketch follows below. This would demand several changes.
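To make that concrete, here is a minimal sketch of what "runnables in all steps" buys us (the prompt and model are illustrative; this assumes the `langchain-openai` package):

```python
# A sketch only: once the whole chain is built from runnables,
# streaming, async, retries and fallbacks come from LCEL itself.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Answer briefly: {input}")
chain = prompt | ChatOpenAI(model="gpt-3.5-turbo") | StrOutputParser()

# Out-of-the-box features, with no custom class required:
for token in chain.stream({"input": "What is dialog?"}):  # streaming
    print(token, end="")

# await chain.ainvoke({"input": "..."})   # async
# chain.with_retry()                      # retries
# chain.with_fallbacks([another_chain])   # fallbacks
```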
@lgabs when you talk about the LCEL interface, do you mean attaching it directly to the endpoint, or creating an instance of the LLM class we already have and implementing the necessities there?
We created the class abstraction so we could add more flexibility to pre/post-processing and to the LLM processing itself.
The `AbstractLLM` was created purposefully to provide a strict signature, so that we could call any LLM we crafted and add the necessary pre/post-processing to it.
> since this is what guarantees new functionalities like streaming, async, parallelism, retries/fallbacks, access to intermediate results, and tracing with langsmith etc.
If we implement the class with these features, we still get the updated tooling, but within the restrictions of our class.
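For instance, a rough sketch of wrapping an LCEL chain inside our class (names and signatures here are hypothetical, not dialog's actual code):

```python
# Hypothetical sketch: keep the strict AbstractLLM-style signature,
# but let `process` delegate to an LCEL chain under the hood.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


class LCELBackedLLM:  # in dialog, this would inherit from AbstractLLM
    def __init__(self, config: dict):
        self.config = config
        prompt = ChatPromptTemplate.from_template("{input}")
        self.chain = prompt | ChatOpenAI() | StrOutputParser()

    def process(self, message: str) -> str:
        # pre-processing would happen here...
        output = self.chain.invoke({"input": message})
        # ...and post-processing here, keeping the class's contract.
        return output
```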
Langchain introduced in its 0.1.0 version many new concepts centered around the LangChain Expression Language (LCEL). This is essentially a declarative way to compose chains in a much more concise and objective manner than before. Chains built with LCEL are designed to offer out-of-the-box functionality such as streaming, async, parallelism, retries/fallbacks, access to intermediate results, and tracing with langsmith. Since this is a significant design change in Langchain, and it really clarifies things (see examples here) and makes everything more composable, it would be a good idea for us to refactor this project around this core concept. This would require major changes.
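For example, composing a chain with LCEL is as simple as piping runnables together (a generic LCEL snippet, not dialog's actual code):

```python
# Generic LCEL composition: prompt -> model -> parser, chained
# declaratively with the pipe operator.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | ChatOpenAI() | StrOutputParser()

print(chain.invoke({"topic": "databases"}))
```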
Currently, it seems that we process the user's input with a custom LLM instance, either from a custom class or a default one, both inheriting from the `DialogLLM` class, which in turn inherits from the `AbstractLLM` class. The whole process of turning the input query into a prompt with templates, retrieving documents for RAG, building a final prompt with the retrieved docs and finally calling an LLM like OpenAI's `gpt-3.5-turbo` is architected manually, compared to what LCEL can offer.

The opportunity of following LCEL can be seen in this RAG Cookbook from Langchain's docs, which can guide the changes we need to make to achieve this guideline.
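In the cookbook's spirit, a RAG chain would look roughly like this (the in-memory FAISS store and its contents are placeholders; dialog would plug in its PGVector-backed store instead):

```python
# A sketch following langchain's RAG cookbook; FAISS is a stand-in
# for dialog's PGVector-backed vector store.
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

vectorstore = FAISS.from_texts(
    ["dialog serves RAG chatbots over an API"], embedding=OpenAIEmbeddings()
)
retriever = vectorstore.as_retriever()  # the retriever is itself a runnable

prompt = ChatPromptTemplate.from_template(
    "Answer based only on this context:\n{context}\n\nQuestion: {question}"
)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-3.5-turbo")
    | StrOutputParser()
)

print(chain.invoke("What does dialog serve?"))
```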