infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
9.36k stars, 882 forks

[Feature Request]: Better project dependency management #1052

Status: Open. yingfeng opened this issue 2 weeks ago

yingfeng commented 2 weeks ago

Describe the feature you'd like

Discussed in https://github.com/orgs/infiniflow/discussions/1269

Originally posted by **http403** June 1, 2024

Hi community and members of InfiniFlow,

Would the community be interested in pausing RAGFlow development briefly to restructure the project around a better dependency management tool? I'm trying to make the Docker image smaller by cutting out the GPU-specific dependencies, using Poetry to help, and I discovered a few dependency conflicts:

- `volcengine` needs `pycryptodome==3.9.9`, got `pycryptodome==3.20.0`
- `volcengine` needs `pytz==2020.5`, got `pytz==2024.1`
- `bcembedding` needs `transformers>=4.35.0,<4.37.0`, got `transformers==4.38.1`

Note: `bcembedding` and `volcengine` aren't version-pinned.

It would be nice to use a dependency management tool such as Poetry or Pipenv to avoid such issues. On top of that, PyCryptodome 3.9.9 and `pytz` 2020.5 were both released in 2020 and are very old, and PyCryptodome versions before 3.19.1 are affected by the CVE-2023-52323 vulnerability.

Again, I don't mind chipping in my time to make it happen.

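
The conflicts listed above can be reproduced with a small check. This is only a minimal sketch, not how Poetry or pip resolve dependencies: the version numbers and constraints are copied from the post, while the helper names (`parse_version`, `satisfies`, `find_conflicts`) are hypothetical and handle plain numeric versions only. A real project would more likely rely on the `packaging` library's `SpecifierSet` or on the resolver built into the chosen tool.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}

# Versions the post reports as actually installed.
INSTALLED = {
    "pycryptodome": "3.20.0",
    "pytz": "2024.1",
    "transformers": "4.38.1",
}

# Constraints from the post: (requiring package, dependency, specifier).
REQUIREMENTS = [
    ("volcengine", "pycryptodome", "==3.9.9"),
    ("volcengine", "pytz", "==2020.5"),
    ("bcembedding", "transformers", ">=4.35.0,<4.37.0"),
]


def parse_version(text):
    """Turn '4.38.1' into (4, 38, 1). Assumes purely numeric versions."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    have = parse_version(version)
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(==|!=|>=|<=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, wanted_text = match.groups()
        wanted = parse_version(wanted_text)
        # Pad with zeros so that '4.37' and '4.37.0' compare as equal.
        width = max(len(have), len(wanted))
        lhs = have + (0,) * (width - len(have))
        rhs = wanted + (0,) * (width - len(wanted))
        if not OPERATORS[op](lhs, rhs):
            return False
    return True


def find_conflicts(installed, requirements):
    """List every requirement that the installed version does not satisfy."""
    return [
        (parent, dep, spec, installed[dep])
        for parent, dep, spec in requirements
        if not satisfies(installed[dep], spec)
    ]


conflicts = find_conflicts(INSTALLED, REQUIREMENTS)
for parent, dep, spec, have in conflicts:
    print(f"{parent} needs {dep}{spec}, but {have} is installed")
```

All three requirements fail against the installed versions, which matches the list in the post. A lock file produced by Poetry or Pipenv would surface the same conflicts at resolution time instead of at runtime.
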
http403 commented 2 weeks ago

I'm the author of the discussion. Which dependency management tool would the InfiniFlow/RAGFlow team prefer? I'm more experienced with Pipenv and less with Poetry. I don't mind learning another one if it is more suitable for your workflow. There is a comparison blog post that you can reference.

KevinHuSh commented 2 weeks ago

> I'm the author of the discussion. Which dependency management tool would the InfiniFlow/RAGFlow team prefer? I'm more experienced with Pipenv and less with Poetry. I don't mind learning another one if it is more suitable for your workflow. There is a comparison blog post that you can reference.

Let's use Poetry.

CamusGao commented 2 weeks ago

Also, is there any possibility of separating the model providers and the main program into different projects, and providing a thin version of the program that only supports the OpenAI-like API?

http403 commented 2 weeks ago

@KevinHuSh I will start working on it, no specific timeline though. Expect dependency version conflicts.