infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Bug]: Error occurred while parsing the file using bge-m3 and qwen-7.2b deployed with Infinity and Xinference. #3799

Closed · 1006076811 closed 1 day ago

1006076811 commented 1 day ago

Is there an existing issue for the same bug?

RAGFlow workspace code commit ID

v0.14.1

RAGFlow image version

v0.14.1

Other environment information

No response

Actual behavior

Error occurred while parsing the file using bge-m3 and qwen-7.2b deployed with Infinity and Xinference.

2024-12-02 15:17:24,915 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 200 OK"
2024-12-02 15:17:25,356 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: 0.6961199294532627, progress_msg: 
2024-12-02 15:17:32,213 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 200 OK"
2024-12-02 15:17:32,819 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: 0.700352733686067, progress_msg: 
2024-12-02 15:17:39,757 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 500 Internal Server Error"
2024-12-02 15:17:39,759 INFO     1168409 Retrying request to /embeddings in 0.848969 seconds
2024-12-02 15:17:40,638 INFO     1168409 HTTP Request: POST http://127.0.0.1:9997/v1/embeddings "HTTP/1.1 400 Bad Request"
2024-12-02 15:17:40,660 INFO     1168409 set_progress(3d05036eb07d11efa36bd47c44d5570d), progress: -1, progress_msg: Page(1~100000001): [ERROR]Embedding error:Error code: 400 - {'detail': '[address=0.0.0.0:28547, pid=140] Model not found, uid: bge-m3-0'}
2024-12-02 15:17:40,672 ERROR    1168409 run_rembedding got exception
Traceback (most recent call last):
  File "/opt/PythonProjects/szzz-llm-rag/rag/svr/task_executor.py", line 403, in do_handle_task
    tk_count, vector_size = embedding(cks, embd_mdl, r["parser_config"], callback)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/rag/svr/task_executor.py", line 307, in embedding
    vts, c = mdl.encode(cnts[i: i + batch_size])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/api/db/services/llm_service.py", line 209, in encode
    emd, used_tokens = self.mdl.encode(texts, batch_size)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/PythonProjects/szzz-llm-rag/rag/llm/embedding_model.py", line 282, in encode
    res = self.client.embeddings.create(input=texts,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/resources/embeddings.py", line 125, in create
    return self._post(
           ^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1260, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 937, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1026, in _request
    return self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1075, in _retry_request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/media/user/data_one/miniconda3/envs/szzz-llm-rag/lib/python3.11/site-packages/openai/_base_client.py", line 1041, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'detail': '[address=0.0.0.0:28547, pid=140] Model not found, uid: bge-m3-0'}
2024-12-02 15:17:40,676 ERROR    1168409 handle_task got exception for task {"id": "3d05036eb07d11efa36bd47c44d5570d", "doc_id": "5b8f3452a55711ef8de0d47c44d5570d", "from_page": 0, "to_page": 100000000, "retry_count": 0, "kb_id": "4e6f8dc6a55711efa604d47c44d5570d", "parser_id": "qa", "parser_config": {"raptor": {"use_raptor": false}}, "name": "question_to_sql.csv", "type": "doc", "location": "question_to_sql.csv", "size": 926118, "tenant_id": "3c1a1b4e797b11efb0ead47c44d5570d", "language": "Chinese", "embd_id": "bge-m3@Xinference", "img2txt_id": "qwen-vl-max", "asr_id": "paraformer-realtime-8k-v1", "llm_id": "qwen2.5-instruct@Xinference", "update_date": 1733123741301}

Expected behavior

No response

Steps to reproduce

The model deployed with Xinference uses a QA-based approach for knowledge base parsing, and Infinity is used as the vector database.

Additional information

No response

JinHai-CN commented 1 day ago

Obviously, the model deployed by xinference can't be accessed.

1006076811 commented 1 day ago

> Obviously, the model deployed by xinference can't be accessed.

The error message shows the model uid `bge-m3-0`, but I never configured `bge-m3-0`; I only configured `bge-m3`. I suspect the extra `-0` suffix is what causes the access failure. The first few embedding batches completed normally, so I don't know at which step the uid becomes `bge-m3-0`.

1006076811 commented 1 day ago

I see. My `batch_size` was set too large, which caused GPU memory to be exceeded. Because Xinference is deployed with Docker, I didn't see the error message on the server side.
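For reference, a common client-side mitigation is to cap the embedding batch size so each request stays within the model server's GPU memory. The sketch below is a minimal, hypothetical helper (not RAGFlow's actual code, which lives in `rag/llm/embedding_model.py`); `encode_fn` stands in for a call like `client.embeddings.create(input=batch, model="bge-m3")`:

```python
from typing import Callable, List, Sequence


def encode_in_batches(
    encode_fn: Callable[[List[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 8,
) -> List[List[float]]:
    """Embed `texts` in chunks of `batch_size` to avoid server-side OOM.

    `encode_fn` is a stand-in for the real embeddings call against the
    Xinference endpoint (hypothetical wiring, for illustration only).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    vectors: List[List[float]] = []
    for i in range(0, len(texts), batch_size):
        batch = list(texts[i : i + batch_size])
        vectors.extend(encode_fn(batch))  # one request per small batch
    return vectors
```

With a Docker-deployed Xinference, `docker logs <container>` should also surface the server-side OOM that the client only sees as opaque 500/400 responses.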