[Bug]: The first reranked chunk is incorrect when using the Nvidia reranker model

Is there an existing issue for the same bug?

[X] I have checked the existing issues.

Branch name

v0.11.0

Commit ID

2f33ec7ad07db037482ef5cfa58df1b3dd0727a5

Other environment information

AWS
r7i.xlarge EC2 Instance
Ubuntu
Docker container

Actual behavior

The reranked chunks are returned in the wrong order when using the nvidia/rerank-qa-mistral-4b model for reranking. Typically, the second-least-relevant chunk is placed first, while the chunk shown second, which should be the most relevant, is ranked lower than it should be.

Expected behavior

The chunks should be returned in the same order they are received from the Nvidia API endpoint.

Steps to reproduce

1. Upload some documents and preprocess them.
2. In RAGFlow, enter the API key obtained from https://build.nvidia.com/nvidia/rerank-qa-mistral-4b
3. Run a retrieval test.

Additional information

rerank_model.py seems to be working correctly; the problem is most likely in rag/nlp/search.py.
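
To illustrate the kind of misordering described above, here is a minimal sketch of one way it could arise. It is not taken from the RAGFlow code. It assumes the rerank endpoint returns a `rankings` list sorted by score, where each entry carries the `index` of the input passage and a `logit` score. The response below is made up, and `scores_in_input_order` and `order_by_score` are hypothetical helper names. If the scores are read in response order and then matched to chunks in input order, the ranking comes out wrong.

```python
def scores_in_input_order(rankings, n_chunks):
    """Place each score at the position of the chunk it belongs to."""
    scores = [0.0] * n_chunks
    for item in rankings:
        scores[item["index"]] = item["logit"]
    return scores


def order_by_score(scores):
    """Return chunk positions sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical response: already sorted by score, best passage first.
rankings = [
    {"index": 2, "logit": 5.0},
    {"index": 0, "logit": 1.5},
    {"index": 1, "logit": -3.0},
]

# Naive reading: scores taken in response order, ignoring "index".
naive_scores = [item["logit"] for item in rankings]
# Aligned reading: scores mapped back to the original chunk positions.
aligned_scores = scores_in_input_order(rankings, n_chunks=3)

print(order_by_score(naive_scores))    # chunk 0 wrongly ranked first
print(order_by_score(aligned_scores))  # chunk 2 ranked first, as the API reported
```

If something like this is happening, the fix would be to map scores by `index` before they are combined with other similarity values. Whether that is the actual cause here still needs to be confirmed against the code.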
