infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
24.4k stars 2.37k forks

[Question]: What I wanted to say/ask #3760

Open Snify89 opened 3 days ago

Snify89 commented 3 days ago

Describe your problem

Hi there,

This is for all the contributors, community members, etc. out there. I hope it's alright to compose a somewhat longer feature request / bug list. First of all, I would like to thank you for all your hard work. You are appreciated, and I wish you all the best for your private lives as well! We know you are busy and have many things to do/answer/read/etc. This is really hard work.

I love that RAGFlow brings many nationalities, languages and characters together, because it works with so many languages, which was a surprise to me. As a German, I find it difficult to get good German-language output from LLMs. Surprisingly, it is not the West but the East (Asia?) that offers better German output than Llama, etc. I came across glm4, which is fantastic for German output.

Anyway, I came across RAGFlow and was astounded by how easy it was to deploy and set up. But I was even more astounded by the backend mechanisms, and I haven't even looked at Graph, tbh. In short: this framework has strong potential, is exactly what I (and others) wanted/needed, and is easy to use.

So thank you and keep it up!

The project has a few hiccups here and there, but it works fantastically and is also well documented.

There might be some duplicates in my list, and I apologize that I haven't checked for existing open requests/issues.

My feature requests:

Edit:

These are the main features I'd like to see in the future. So thanks in advance for considering them :)

I have a few bugs/suggestions as well:

Otherwise, guys, this is the real deal you made here, and I am thankful for your hard work and dedication to bringing the world closer together (through multi-language support)!

Thank you for reading and considering any idea/suggestion.

yingfeng commented 3 days ago

Thank you very much for your attention and suggestions. Support from the community is our greatest motivation to move forward!

Regarding your proposals:

  1. OpenAI compatible server API

That's a very good suggestion. We'll consider it.

  2. OCRing with visual models

We'll be adding more models of this kind, some trained by us and others from the community.

  3. Agent

Agent is still being improved. It will not only have a shared connection but also an Agent Store, so that different people can contribute Agents: each Agent is like an App, and RAGFlow is like an App Store.

  4. More language support

Globalization is the goal of RAGFlow. We are indeed understaffed, but we will continue to make improvements to make RAGFlow truly global!

  5. Function calling

The Agent has an Invoke, which is somewhat similar to function calling: fill in the URL, and you can then call other services from the Agent. Any further requests for this feature are welcome!

  6. API

There is still a lot of room for improvement in the API, and a file management API will be provided.

  7. Memory management

Memory management is very important, and we intend to use ES/Infinity to manage memory and provide real-time search (not just vectors, but also full text, etc., indexed in real time). Providing powerful multi-agent support based on RAGFlow is one of the goals for next year.

Snify89 commented 3 days ago

Thank you very much. This sounds promising :) I have another thing: parsing control is a little too limited. I'd like documents to be auto-parsed upon dataset entry or so, with an option for the maximum number of parsing threads (actually, documents) that can be processed at a time. The bulk option is OK, but I'd like this to happen automatically, with a configurable maximum on how many documents are processed at once.

Thank you.

Let me know if you want me to open separate issues for the suggestions/bugs.

Edit: What about "metadata"? Are keyword(s) the replacement/alternative? Is that sufficient to store something like "Document type: Invoice" or "Invoice" ... ?

yingfeng commented 3 days ago

Such control could be provided through the API in the future.
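
For illustration only, the "configurable maximum" requested above could be approximated on the client side with a bounded worker pool. This is a minimal sketch, not RAGFlow code: `parse_document` is a hypothetical stand-in for whatever call actually triggers parsing of one document, and `parse_all` and `max_parallel` are names made up for this example.

```python
import concurrent.futures
import time


def parse_document(doc_id: str) -> str:
    """Hypothetical placeholder: pretend to parse one document."""
    time.sleep(0.01)  # simulate work; a real call would contact the server
    return f"{doc_id}: parsed"


def parse_all(doc_ids, max_parallel=2, parse=parse_document):
    """Parse every document, with at most `max_parallel` running at once.

    Results are returned in the same order as `doc_ids`.
    """
    # The pool size is the cap: extra documents wait until a worker is free.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(parse, doc_ids))


if __name__ == "__main__":
    for line in parse_all(["invoice-1", "invoice-2", "invoice-3"], max_parallel=2):
        print(line)
```

Raising or lowering `max_parallel` changes how many documents are in flight at once, with no other changes to the calling code.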