Open thinkall opened 4 months ago
Hello, I was wondering if there was support coming for pgvector, if not I would be happy to contribute.
@thinkall
Hi @Knucklessg1 , contribution is welcome, thank you for your interest!
Hi @thinkall, would this PR, https://github.com/microsoft/autogen/pull/2046, help out with "Automatically decide whether RAG is needed"?
I was thinking that if the agent adds a tag like <rag context="some context"> in the message, we can intercept that with one of the hooks or even a reply function, and then perform some RAG operations.
Thank you @WaelKarkoub , interesting idea! Would adding
@thinkall we could define what that tag means by adding attributes (e.g. <rag context="some context" task="search"> could mean it needs to look through some databases). I'm not fully familiar with how RAG works, but that tag system should be general enough for multiple use cases.
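To make the tag idea concrete, here is a minimal sketch of how a message could be scanned for such tags before any retrieval is triggered. It is only an illustration: the function names are hypothetical, the tag grammar is assumed from the examples above (double-quoted attributes, no ">" inside values), and wiring it into an actual AutoGen hook or reply function is left out.

```python
import re
from typing import Dict, List

# Assumed tag shape, taken from the discussion: <rag context="..." task="...">
RAG_TAG = re.compile(r"<rag\b([^>]*)>")
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def extract_rag_requests(message: str) -> List[Dict[str, str]]:
    """Return one attribute dict per <rag ...> tag found in the message."""
    return [dict(ATTRIBUTE.findall(m.group(1))) for m in RAG_TAG.finditer(message)]


def strip_rag_tags(message: str) -> str:
    """Remove the tags so the rest of the message can be passed along as-is."""
    return RAG_TAG.sub("", message).strip()


message = 'Let me check the docs. <rag context="vector db setup" task="search">'
requests = extract_rag_requests(message)  # attributes describing what to retrieve
clean = strip_rag_tags(message)           # message text without the tag
```

A hook could then decide, based on the hypothetical task attribute, whether to run a search or skip retrieval entirely when no tag is present.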
Great initiative @thinkall. How much are you thinking of reusing what has already been built and proven in other frameworks, like LangChain?
Thank you @ChristianWeyer , I hope to support as many established features in the OSS as possible. So we're carefully thinking of the new design of the rag feature in AutoGen. Would you like to share your thoughts?
One thing is that there are so many connector and retriever implementations in LangChain that it would not make sense to reinvent the wheel and try to keep up. The same goes for embedding support.
Agree!
Would you like to have a quick chat on this? It would be great to hear more from you!
@thinkall Will the upcoming RAG update still require using message_generator in groupchat scenarios? It's my understanding that currently, the RAG agent has to initiate the chat and message_generator has to be used, which results in all initial prompt messages being sent through retrieve_docs in RetrieveUserProxyAgent.
Sure. I am cethewe in AG Discord.
Hi @Knucklessg1 , are you in our Discord channel? Could we have a quick chat? Thanks.
Hi @dsalas-crogl , I'd like to remove the usage of message_generator. Would that benefit your use case? Thanks.
Are you in our Discord channel?
Yes absolutely. I reached out on Discord.
@thinkall any flow diagram regarding the rag?
Hi @jamesliu , there's one diagram here, you can find the workflow details in the Introduction section.
Interesting roadmap, and I'm very happy with ChromaDB; looking forward to the in-memory vector store too. If anyone is interested, it could be a good opportunity to collaborate and break down complex tasks.
I'll also consider creating and sharing an "advanced upsert" agent, which enriches the text chunks to improve retrieval performance.
Are RAG applications limited to document processing, or do they extend to code-related tasks as well? For instance:
- Existing code that needs a fix, or
- New code (e.g. a new service) that goes through incremental development
Use case:
- In a system comprised of numerous microservices that are regularly updated with new features or fixes, how can an agent acquire knowledge about these changes? Is RAG a viable method for this?
- For the development of new microservices within an existing system, how can knowledge be transferred to agents to enhance the design and implementation process using RAG?
RAG can help if the documents are well organized. For pure code, currently, you can use 3rd party code chunk methods to help load the code into the vector dbs.
Let's also add a documentation task to the roadmap? We should have a RAG category under https://microsoft.github.io/autogen/docs/topics
Do we also have a task on the roadmap for using custom embeddings? This is a very vital and important requirement for successful RAG. Another would be to use a re-ranking model optionally to improve RAG results. @thinkall
Custom embeddings are already supported and will also be supported in the new version.
Re-ranking may also be supported, but we may not implement the algorithms ourselves; instead, we could support plugging in different re-ranking models.
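One way to picture pluggable re-ranking is a plain callable that reorders the retrieved chunks. The sketch below is only an assumption about what such a plug-in point might look like: the Reranker type, keyword_overlap_reranker, and retrieve_with_rerank are hypothetical names, not AutoGen APIs, and the keyword-overlap scoring stands in for a real re-ranking model.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical plug-in signature: (query, candidate chunks) -> chunks in new order.
Reranker = Callable[[str, Sequence[str]], List[str]]


def keyword_overlap_reranker(query: str, candidates: Sequence[str]) -> List[str]:
    """Toy re-ranker: order chunks by how many query words they share."""
    query_terms = set(query.lower().split())

    def score(chunk: str) -> int:
        return len(query_terms & set(chunk.lower().split()))

    # sorted() is stable, so ties keep the retriever's original order.
    return sorted(candidates, key=score, reverse=True)


def retrieve_with_rerank(
    query: str,
    candidates: Sequence[str],
    reranker: Optional[Reranker] = None,
    top_k: int = 2,
) -> List[str]:
    """Apply the optional re-ranker, then keep the top_k chunks."""
    ordered = reranker(query, candidates) if reranker else list(candidates)
    return ordered[:top_k]


candidates = [
    "install the package with pip",
    "configure the vector db connection",
    "the vector db stores document embeddings",
]
best = retrieve_with_rerank("vector db embeddings", candidates, keyword_overlap_reranker)
```

Because the re-ranker is just a function argument, a cross-encoder or any other model could be swapped in without touching the retrieval code.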
@thinkall Is it possible for you to provide an example of a 3rd party code chunk method? I'm very interested in extending the knowledge of an agent to my whole codebase :)
I’ve got some thoughts on how we could use agents to automate code generation
+1 for Maxime, it would be really helpful to include a section on code chunking with examples to illustrate the working.
Now, when it comes to generating code, agents shouldn't just spit out code, they should mimic the way engineers think and work in real life. Here’s what I mean:
For the code workflow, we could see something like this:
These steps would be helpful both for rolling out new features or fixes to existing projects and for starting fresh ones. I managed to get a mini reference implementation of a distributed key-value store (80%) using chatgpt(gpt4) and was able to build, test, and run the services locally (screenshot attached). I was experimenting with autogen to reproduce the steps that I have followed and see if I can achieve decent level autonomy (I am sure it will take many iterations :) ). I am still learning and experimenting. I will share my findings as I make progress. [image: Screenshot 2024-04-13 at 10.37.18 AM.png]
Thanks for all the great work and support.
Regards lnr
@raolak
A solution for this has just been released. Check out https://github.com/princeton-nlp/SWE-agent
Hi @maximedupre , please check out an example of using 3rd party chunk method here: https://microsoft.github.io/autogen/blog/2023/10/18/RetrieveChat/#customizing-text-split-function
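For code specifically, a split function can follow the structure of the source instead of a fixed character count. The sketch below is one possible approach using only Python's standard ast module (Python 3.8+); split_python_source is a hypothetical helper, not something the linked blog post or AutoGen provides, and how it would be passed to the agent is covered by the link above rather than shown here.

```python
import ast
from typing import List


def split_python_source(source: str) -> List[str]:
    """Return one chunk per top-level function or class, plus one chunk
    holding the remaining top-level statements (imports, constants, ...).
    Comments and blank lines between top-level nodes are not preserved."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: List[str] = []
    other: List[str] = []
    for node in tree.body:
        # Start at the first decorator, if any, so it stays with its definition.
        decorators = getattr(node, "decorator_list", [])
        start = min([node.lineno] + [d.lineno for d in decorators])
        segment = "\n".join(lines[start - 1:node.end_lineno])
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(segment)
        else:
            other.append(segment)
    if other:
        chunks.insert(0, "\n".join(other))
    return chunks


sample = '''import os

def load(path):
    return open(path).read()

class Store:
    def add(self, item):
        self.items.append(item)
'''
chunks = split_python_source(sample)  # imports, then load(), then Store
```

Keeping each function or class whole means a retrieved chunk is more likely to be meaningful on its own; very large classes would still need further splitting.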
Why RAG
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of LLMs by incorporating a retrieval mechanism into the generative process. This approach allows the model to draw on relevant information from a pre-existing knowledge base, which can significantly improve the quality and accuracy of its generated responses. For agent chat, incorporating a RAG agent therefore offers several compelling advantages that can enhance the performance and utility of your agent system.
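As a rough illustration of the retrieve-then-generate idea, here is a toy sketch in plain Python. It makes several simplifying assumptions: a bag-of-words count vector stands in for a real embedding model, a Python list stands in for a vector database, and the final LLM call is omitted. None of the helper names come from AutoGen.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, max_words: int = 40) -> List[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]


def build_prompt(query: str, context_chunks: List[str]) -> str:
    """Put the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "Qdrant is a vector database that stores embeddings.",
    "Bananas are rich in potassium.",
    "Chroma is another vector database used for retrieval.",
]
query = "which vector database stores embeddings"
top = retrieve(query, docs, top_k=1)
prompt = build_prompt(query, top)  # this string would be sent to the LLM
```

The generation step then sees only the retrieved context plus the question, which is what lets the model answer from the knowledge base rather than from its training data alone.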
RAG in AutoGen
AutoGen introduced RetrieveUserProxyAgent and RetrieveAssistantAgent for performing RetrieveChat in Aug, 2023 and announced them in a blog post in Oct, 2023. Given a set of documents, the Retrieval-augmented User Proxy first automatically processes the documents: it splits, chunks, and stores them in a vector database. Then, for a given user input, it retrieves relevant chunks as context and sends them to the Retrieval-augmented Assistant, which uses an LLM to generate code or text to answer questions. The agents converse until they find a satisfactory answer.
As both AutoGen and RAG are evolving very fast, we find that many users are asking for support for customized vector databases, incremental document ingestion, customized retrieval/re-ranking algorithms, customized RAG patterns/workflows, etc. We've addressed some of the issues and feature requests: for example, we've added QdrantRetrieveUserProxyAgent for using Qdrant as the vector db, and we've integrated UNSTRUCTURED to support many unstructured document types. However, there is much more to do.
Our Plan
In order to better support RAG in AutoGen, we plan to refactor the existing RetrieveChat agents. The goals include:
Primary goals
Optional goals