infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
18.6k stars 1.89k forks

[Feature Request]: Some requirements regarding language, base URL, and model factory: #687

Open dashi6174 opened 5 months ago

dashi6174 commented 5 months ago

Is there an existing issue for the same feature request?

Is your feature request related to a problem?

no

Describe the feature you'd like

Some requirements regarding language, base-url, model factory, and so on:

1. Default language. Support setting the default language to Chinese in service_conf.yaml, with the database default language also set to Chinese. Users would then not need to change the language on their first login, which matters for the user experience when many users log in and use the system.

2. base-url for Moonshot, zhipu-ai, and deepseek. Every model that requires network access should allow a configurable base-url. Inside a company, servers are usually not connected to the internet and must go through an API proxy.

3. OAuth authentication. Support OAuth 2.0 so that RAGFlow can be integrated quickly with a company's internal authentication system and deployed for everyone to use.

4. File formats. When uploading files, filter the file selector by supported formats. Currently any file can be selected and uploaded, and only after the upload is it reported as unsupported.

5. Custom model factory. Allow custom model factories to be defined in the backend service_conf.yaml (not limited to the built-in Tongyi-Qianwen, OpenAI, ZHIPU-AI, Ollama, and Moonshot factories), with preset defaults for chat_model, embedding_model, image2text_model, and asr_model. Supported models should be freely combinable, for example gpt-3.5-turbo as the chat_model and text-embedding-v2 as the embedding_model. A custom model factory needs to support configuring the authentication information for its models.

6. Model download. Downloadable embedding models, such as the BAAI and monic-ai series and bce-embedding-base_v1, are stored by default in /root/.cache/huggingface/hub inside the container. Rebuilding the container loses them, so they have to be downloaded again. This directory should be mappable to a location outside the container in docker-compose.
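To make requests 2 and 5 more concrete, here is a minimal Python sketch of what a custom model factory with a configurable base-url might look like once the configuration has been loaded. Everything in it is an assumption for illustration: the `model_factories` section, its keys, the `ModelFactory` class, the `load_factories` helper, and the proxy URL are hypothetical and are not part of RAGFlow's current service_conf.yaml or code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical shape of a "model_factories" section in service_conf.yaml,
# written as the dict a YAML loader would produce. These keys are NOT part
# of RAGFlow today; they only illustrate requests 2 and 5.
EXAMPLE_CONF = {
    "model_factories": {
        "internal-proxy": {
            "base_url": "https://llm-proxy.example.internal/v1/",  # placeholder proxy
            "api_key": "sk-placeholder",                            # placeholder credential
            "chat_model": "gpt-3.5-turbo",
            "embedding_model": "text-embedding-v2",
        }
    }
}


@dataclass(frozen=True)
class ModelFactory:
    """One user-defined factory: endpoint, credentials, and default models."""
    name: str
    base_url: str
    api_key: str
    chat_model: str
    embedding_model: str
    image2text_model: Optional[str] = None
    asr_model: Optional[str] = None

    def endpoint(self, path: str) -> str:
        """Join the configured base URL and an API path with exactly one slash."""
        return self.base_url.rstrip("/") + "/" + path.lstrip("/")


def load_factories(conf: dict) -> dict:
    """Build ModelFactory objects from the hypothetical config section."""
    factories = {}
    for name, settings in conf.get("model_factories", {}).items():
        factories[name] = ModelFactory(name=name, **settings)
    return factories


factories = load_factories(EXAMPLE_CONF)
proxy = factories["internal-proxy"]
print(proxy.endpoint("/chat/completions"))
```

With a layout along these lines, pointing every factory at an internal API proxy would only require changing `base_url` in the configuration, and the chat and embedding models could come from different vendors behind the same proxy.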

Describe implementation you've considered

No response

Documentation, adoption, use case

No response

Additional information

No response

KevinHuSh commented 5 months ago

Excellent points. A very profound ragflower.