langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Suggest that Dify support the BGE rerank model #3904

Closed BluerAngala closed 5 months ago

BluerAngala commented 5 months ago

Self Checks

1. Is this request related to a challenge you're experiencing?

The Cohere rerank model is not very effective for Chinese in certain scenarios, such as analyzing legal provisions, and is not as good as BGE.

2. Describe the feature you'd like to see

Suggest that Dify support the BGE rerank model.

3. How will this feature improve your workflow or experience?

Documents written in rigorous language, such as internal rules and regulations of enterprises and institutional norms for social operation, require a deep understanding of the text. Currently, the results of calling Cohere's rerank API are unsatisfactory, especially given its weak support for Chinese, which can even hurt business performance and urgently needs improvement.

4. Additional context or comments

As a supplement, here is the relevant material for running the BGE reranker:

Source code deployment method

  1. Installation environment: Python 3.9 / 3.10, CUDA 11.7

  2. Download the code. The repositories for the three models are:

    https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-base
    https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-large
    https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-rerank-v2-m3

  3. Install dependencies: pip install -r requirements.txt

  4. Download the models. The Hugging Face repository addresses for the three models are as follows:

    https://huggingface.co/BAAI/bge-reranker-base
    https://huggingface.co/BAAI/bge-reranker-large
    https://huggingface.co/BAAI/bge-rerank-v2-m3

    Clone the model into the corresponding code directory. Directory structure:

    bge-reranker-base/
        app.py
        Dockerfile
        requirements.txt
  5. Run the code: python app.py (a quick sanity check of the downloaded model is sketched below).
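
As a quick sanity check after downloading, the reranker can be scored directly with the Hugging Face transformers library. This sketch follows the usage shown on the BAAI/bge-reranker-base model card and is not part of the FastGPT service itself; the model id and the example texts are placeholders:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Load the reranker (use the local clone path or the Hugging Face id).
    tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-base")
    model = AutoModelForSequenceClassification.from_pretrained("BAAI/bge-reranker-base")
    model.eval()

    # Each pair is (query, passage); a higher score means more relevant.
    pairs = [
        ["违约责任如何认定？", "第五百七十七条 当事人一方不履行合同义务的，应当承担违约责任。"],
        ["违约责任如何认定？", "今天杭州天气晴朗。"],
    ]
    with torch.no_grad():
        inputs = tokenizer(pairs, padding=True, truncation=True, max_length=512, return_tensors="pt")
        scores = model(**inputs, return_dict=True).logits.view(-1).float()
    print(scores)  # the legal passage should receive the higher score
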

Docker deployment

The image names are:

registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1 (4 GB+)
registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-large:v0.1 (5 GB+)
registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1 (5 GB+)
Port

6006

Environment variable

ACCESS_TOKEN=<access security credential>; requests must include the header Authorization: Bearer ${ACCESS_TOKEN}
Example of Running a Command

The auth token is mytoken

docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1

docker-compose.yml example

version: "3"
services:
  reranker:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
    container_name: reranker
    # GPU runtime; if the host has no GPU driver installed, remove the deploy section
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 6006:6006
    environment:
      - ACCESS_TOKEN=mytoken
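
Once the container is running (via docker run or Compose), the service can be called over HTTP with the bearer token. A minimal sketch in Python, assuming the service exposes a Cohere-style rerank endpoint at /v1/rerank on port 6006; the exact path and payload fields are assumptions here, so confirm them against the FastGPT reference below:

    import requests

    # Assumed endpoint and payload shape; verify against the FastGPT bge-rerank docs.
    url = "http://localhost:6006/v1/rerank"
    headers = {"Authorization": "Bearer mytoken"}
    payload = {
        "query": "违约责任如何认定？",
        "documents": [
            "第五百七十七条 当事人一方不履行合同义务的，应当承担违约责任。",
            "今天杭州天气晴朗。",
        ],
    }
    resp = requests.post(url, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
    print(resp.json())  # expect a relevance score per document
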

Reference materials: https://doc.fastai.site/docs/development/custom-models/bge-rerank/

5. Can you help us with this feature?

JohnJyong commented 5 months ago

BGE rerank is supported in Xinference; you can use Xinference to run a local rerank model. Thanks @BluerAngala
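
For reference, a minimal sketch of serving and calling a BGE reranker through Xinference's Python client; the model name, server address, and the exact rerank call are assumptions based on Xinference's documented rerank support, so check the Xinference docs for the current API:

    from xinference.client import Client

    # Assumes an Xinference server is already running at this address.
    client = Client("http://localhost:9997")

    # Launch a rerank model; "bge-reranker-base" is assumed to be one of Xinference's built-in rerank models.
    model_uid = client.launch_model(model_name="bge-reranker-base", model_type="rerank")
    model = client.get_model(model_uid)

    # Score candidate passages against the query; higher score = more relevant.
    result = model.rerank(
        documents=["第五百七十七条 当事人一方不履行合同义务的，应当承担违约责任。", "今天杭州天气晴朗。"],
        query="违约责任如何认定？",
    )
    print(result)

The launched model can then be added in Dify as a rerank model through the Xinference provider.
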