infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Bug]: Error occurred while parsing the file using bge-m3 and qwen-7.2b deployed with Infinity and Xinference. #3799

Closed · 1006076811 closed 1 day ago

1006076811 commented 1 day ago

Is there an existing issue for the same bug?

RAGFlow workspace code commit ID

v0.14.1

RAGFlow image version

v0.14.1

Other environment information

No response

Actual behavior

Error occurred while parsing the file using bge-m3 and qwen-7.2b deployed with Infinity and Xinference.

2024-12-02 15:17:24,915 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 200 OK"
2024-12-02 15:17:25,356 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: 0.6961199294532627, progress_msg: 
2024-12-02 15:17:32,213 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 200 OK"
2024-12-02 15:17:32,819 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: 0.700352733686067, progress_msg: 
2024-12-02 15:17:39,757 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 500 Internal Server Error"
2024-12-02 15:17:39,759 INFO     1168409 Retrying request to /embeddings in 0.848969 seconds
2024-12-02 15:17:40,638 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 400 Bad Request"
2024-12-02 15:17:40,660 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: -1, progress_msg: Page(1~100000001): [ERROR]Embedding error:Error code: 400 - {'detail': '[address=0.0.0.0:28547, pid=140] Model not found, uid: bge-m3-0'}
2024-12-02 15:17:40,672 ERROR    1168409 run_rembedding got exception
Traceback (most recent call last):
  File "/opt/PythonProjects/szzz-llm-rag/rag/svr/task_executor.py", line 403, in do_handle_task
    tk_count, vector_size = embedding(cks, embd_mdl, r["parser_config"], callback)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/rag/svr/task_executor.py", line 307, in embedding
    vts, c = mdl.encode(cnts[i: i + batch_size])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/api/db/services/llm_service.py", line 209, in encode
    emd, used_tokens = self.mdl.encode(texts, batch_size)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/rag/llm/embedding_model.py", line 282, in encode
    res = self.client.embeddings.create(input=texts,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/resources/embeddings.py", line 125, in create
    return self._post(
           ^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1260, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 937, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1026, in _request
    return self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1075, in _retry_request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1041, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': '[address=0.0.0.0:28547, pid=140] Model not found, uid: bge-m3-0'}
2024-12-02 15:17:40,676 ERROR    1168409 handle_task got exception for task {"id": "3d05036eb07d11efa36bd47c44d5570d", "doc_id": "5b8f3452a55711ef8de0d47c44d5570d", "from_page": 0, "to_page": 100000000, "retry_count": 0, "kb_id": "4e6f8dc6a55711efa604d47c44d5570d", "parser_id": "qa", "parser_config": {"raptor": {"use_raptor": false}}, "name": "question_to_sql.csv", "type": "doc", "location": "question_to_sql.csv", "size": 926118, "tenant_id": "3c1a1b4e797b11efb0ead47c44d5570d", "language": "Chinese", "embd_id": "bge-m3@Xinference", "img2txt_id": "qwen-vl-max", "asr_id": "paraformer-realtime-8k-v1", "llm_id": "qwen2.5-instruct@Xinference", "update_date": 1733123741301}

Expected behavior

No response

Steps to reproduce

The model deployed with Xinference uses a QA-based approach for knowledge base parsing, and Infinity is used as the vector database.

Additional information

No response

JinHai-CN commented 1 day ago

Obviously, the model deployed by xinference can't be accessed.

1006076811 commented 1 day ago

> Obviously, the model deployed by xinference can't be accessed.

The error message shows the model uid `bge-m3-0`, but I never configured `bge-m3-0`; I only configured `bge-m3`. I suspect the extra `-0` suffix is what causes the access failure. The first few embedding batches completed normally, so I don't know at which step the uid becomes `bge-m3-0`.

1006076811 commented 1 day ago

I see. My `batch_size` was set too large, which caused GPU memory to be exceeded. Because Xinference is deployed with Docker, I didn't see the error message on the server side.
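For reference, a common client-side mitigation is to cap the embedding batch size so each request stays within the model server's GPU memory. The sketch below is a minimal, hypothetical helper (not RAGFlow's actual code, which lives in `rag/llm/embedding_model.py`); `encode_fn` stands in for a call like `client.embeddings.create(input=batch, model="bge-m3")`:

```python
from typing import Callable, List, Sequence


def encode_in_batches(
    encode_fn: Callable[[List[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 8,
) -> List[List[float]]:
    """Embed `texts` in chunks of `batch_size` to avoid server-side OOM.

    `encode_fn` is a stand-in for the real embeddings call against the
    Xinference endpoint (hypothetical wiring, for illustration only).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    vectors: List[List[float]] = []
    for i in range(0, len(texts), batch_size):
        batch = list(texts[i : i + batch_size])
        vectors.extend(encode_fn(batch))  # one request per small batch
    return vectors
```

With a Docker-deployed Xinference, `docker logs <container>` should also surface the server-side OOM that the client only sees as opaque 500/400 responses.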