infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0
22.85k stars 2.24k forks

[Bug]: Issue error on recall testing #2367

Open gaspire opened 2 months ago

gaspire commented 2 months ago

Is there an existing issue for the same bug?

Branch name

main

Commit ID

main

Other environment information

No response

Actual behavior

BadRequestError('search_phase_execution_exception', meta=ApiResponseMeta(status=400, http_version='1.1', headers={'X-elastic-product': 'Elasticsearch', 'content-type': 'application/vnd.elasticsearch+json;compatible-with=8', 'content-length': '783'}, duration=0.008754491806030273, node=NodeConfig(scheme='http', host='es01', port=9200, path_prefix='', headers={'user-agent': 'elasticsearch-py/8.12.1 (Python/3.10.12; elastic-transport/8.12.0)'}, connections_per_node=10, request_timeout=10.0, http_compress=False, verify_certs=False, ca_certs=None, client_cert=None, client_key=None, ssl_assert_hostname=None, ssl_assert_fingerprint=None, ssl_version=None, ssl_context=None, ssl_show_warn=True, _extras={})), body={'error': {'root_cause': [{'type': 'query_shard_exception', 'reason': 'failed to create query: field [q_1024_vec] does not exist in the mapping', 'index_uuid': 'zhHfqe-zQmWQqUdagWlmhg', 'index': 'ragflow_0097485c700311efac900242ac170006'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'dfs', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'ragflow_0097485c700311efac900242ac170006', 'node': 'qtLNF89dT06AMSsPc96ZXw', 'reason': {'type': 'query_shard_exception', 'reason': 'failed to create query: field [q_1024_vec] does not exist in the mapping', 'index_uuid': 'zhHfqe-zQmWQqUdagWlmhg', 'index': 'ragflow_0097485c700311efac900242ac170006', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'field [q_1024_vec] does not exist in the mapping'}}}]}, 'status': 400})

Expected behavior

No response

Steps to reproduce

Expected it to work normally.

Additional information

No response

KevinHuSh commented 2 months ago

Click the documents in the knowledge base and make sure they were parsed successfully.

baojingyu commented 2 months ago

Me too, I get an error on recall testing!

BadRequestError('search_phase_execution_exception', meta=ApiResponseMeta(status=400, http_version='1.1', headers={'X-elastic-product': 'Elasticsearch', 'content-type': 'application/vnd.elasticsearch+json;compatible-with=8', 'content-length': '780'}, duration=0.10873889923095703, node=NodeConfig(scheme='http', host='es01', port=9200, path_prefix='', headers={'user-agent': 'elasticsearch-py/8.12.1 (Python/3.10.12; elastic-transport/8.12.0)'}, connections_per_node=10, request_timeout=10.0, http_compress=False, verify_certs=False, ca_certs=None, client_cert=None, client_key=None, ssl_assert_hostname=None, ssl_assert_fingerprint=None, ssl_version=None, ssl_context=None, ssl_show_warn=True, _extras={})), body={'error': {'root_cause': [{'type': 'query_shard_exception', 'reason': 'failed to create query: field [q_768_vec] does not exist in the mapping', 'index_uuid': 'e_Rb3luUTQiRBdpDWl6HPQ', 'index': 'ragflow_ad456d9c71b411efba7a0242ac120006'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'dfs', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'ragflow_ad456d9c71b411efba7a0242ac120006', 'node': 'ZnTeYYAVS0S_QYXD14c0UQ', 'reason': {'type': 'query_shard_exception', 'reason': 'failed to create query: field [q_768_vec] does not exist in the mapping', 'index_uuid': 'e_Rb3luUTQiRBdpDWl6HPQ', 'index': 'ragflow_ad456d9c71b411efba7a0242ac120006', 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'field [q_768_vec] does not exist in the mapping'}}}]}, 'status': 400})

AbelCastelao commented 1 month ago

Same error: BadRequestError('search_phase_execution_exception', meta=ApiResponseMeta(status=400, http_version='1.1', headers={'X-elastic-product': 'Elasticsearch', 'content-type': 'application/vnd.elasticsearch+json;compatible-with=8', 'content-length': '810'}, duration=0.005482673645019531, node=NodeConfig(scheme='http', host='es01', port=9200, path_prefix='', headers={'user-agent': 'elasticsearch-py/8.12.1 (Python/3.10.12; elastic-transport/8.12.0)'}, connections_per_node=10, request_timeout=10.0, http_compress=False, verify_certs=False, ca_certs=None, client_cert=None, client_key=None, ssl_assert_hostname=None, ssl_assert_fingerprint=None, ssl_version=None, ssl_context=None, ssl_show_warn=True, _extras={})), body={'error': {'root_cause': [{'type': 'query_shard_exception', 'reason': 'failed to create query: [knn] queries are only supported on [dense_vector] fields', 'index_uuid': 'cUaTm9PHTCO_byjYLuMllA', 'index': 'ragflow_0e784c16774011ef99df0242ac120006'}], 'type': 'search_phase_execution_exception', 'reason': 'all shards failed', 'phase': 'dfs', 'grouped': True, 'failed_shards': [{'shard': 0, 'index': 'ragflow_0e784c16774011ef99df0242ac120006', 'node': 'O6t2x9VjS2WlzJrzH1hjEA', 'reason': {'type': 'query_shard_exception', 'reason': 'failed to create query: [knn] queries are only supported on [dense_vector] fields', 'index_uuid': 'cUaTm9PHTCO_byjYLuMllA', 'index': 'ragflow_0e784c16774011ef99df0242ac120006', 'caused_by': {'type': 'illegal_argument_exception', 'reason': '[knn] queries are only supported on [dense_vector] fields'}}}]}, 'status': 400})

The documents show parse status SUCCESS.
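The three reports in this thread share the same outer exception but not the same root cause: two say the vector field is missing from the mapping, and this one says the field exists but is not a `dense_vector`. A minimal sketch for pulling just the root-cause reasons out of such an error body, so the reports are easier to compare. The helper name `root_cause_reasons` is hypothetical, and the sample `body` is a trimmed copy of the one pasted above. With elasticsearch-py 8.x the same dictionary should be available on the caught exception's `body` attribute, but treat that as an assumption to verify.

```python
def root_cause_reasons(error_body):
    """Return the de-duplicated 'reason' strings from an Elasticsearch error body.

    Hypothetical helper, not part of RAGFlow or elasticsearch-py.
    """
    error = error_body.get("error", {})
    reasons = []
    for cause in error.get("root_cause", []):
        reason = cause.get("reason")
        if reason and reason not in reasons:
            reasons.append(reason)
    return reasons


# Trimmed copy of the error body from the report above.
body = {
    "error": {
        "root_cause": [
            {
                "type": "query_shard_exception",
                "reason": "failed to create query: [knn] queries are only "
                          "supported on [dense_vector] fields",
                "index": "ragflow_0e784c16774011ef99df0242ac120006",
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
    },
    "status": 400,
}

reasons = root_cause_reasons(body)
for reason in reasons:
    print(reason)
```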

KevinHuSh commented 1 month ago

Can you check the ES status? And what does the chunk list show when you click the documents?
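One way to check the ES side is to look at the index mapping and see whether the vector field named in the error is present and typed as `dense_vector`. The sketch below assumes the usual shape of a `GET /<index>/_mapping` response. The helper name `check_vector_field` is hypothetical, and the sample mappings are made up for illustration, not taken from any reporter's cluster.

```python
def check_vector_field(mapping_response, index, field):
    """Classify a field in a GET /<index>/_mapping response.

    Returns "missing", "wrong_type", or "ok". Hypothetical helper for
    illustration only.
    """
    properties = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get("properties", {})
    )
    if field not in properties:
        return "missing"  # matches "field [...] does not exist in the mapping"
    if properties[field].get("type") != "dense_vector":
        return "wrong_type"  # matches "[knn] queries are only supported on [dense_vector] fields"
    return "ok"


# Made-up mappings for illustration. A real one could be fetched with
# elasticsearch-py, e.g. Elasticsearch("http://es01:9200").indices.get_mapping(index=index).body
# (host taken from the error output; adjust to your deployment).
index = "ragflow_0097485c700311efac900242ac170006"
no_vector = {index: {"mappings": {"properties": {"content_ltks": {"type": "text"}}}}}
bad_type = {index: {"mappings": {"properties": {"q_1024_vec": {"type": "float"}}}}}
good = {index: {"mappings": {"properties": {"q_1024_vec": {"type": "dense_vector", "dims": 1024}}}}}

results = [check_vector_field(m, index, "q_1024_vec") for m in (no_vector, bad_type, good)]
print(results)
```

If the result is "missing", no chunk with that embedding dimension was ever indexed. If it is "wrong_type", the field was probably created by dynamic mapping rather than by the intended index template. Both readings are guesses from the error text, not confirmed by the maintainers.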

AbelCastelao commented 1 month ago

The chunks seem OK. Here is the parsing progress log:

Progress Msg:
Task has been received.
Page(1-13): OCR is running...
Page(1-13): OCR finished
Page(1-13): Layout analysis finished.
Page(1-13): Table analysis finished.
Page(1-13): Text merging finished
Page(1-13): Finished slicing files(34). Start to embedding the content.
Page(1-13): Finished embedding(153.96)! Start to build index!
Page(1-13): Done!
Task has been received.
Page(13-25): OCR is running...
Page(13-25): OCR finished
Page(13-25): Layout analysis finished.
Page(13-25): Table analysis finished.
Page(13-25): Text merging finished
Page(13-25): Finished slicing files(40). Start to embedding the content.
Page(13-25): Finished embedding(185.27)! Start to build index!
Page(13-25): Done!
Task has been received.
Page(25-37): OCR is running...
Page(25-37): OCR finished
Page(25-37): Layout analysis finished.
Page(25-37): Table analysis finished.
Page(25-37): Text merging finished
Page(25-37): Finished slicing files(38). Start to embedding the content.
Page(25-37): Finished embedding(178.34)! Start to build index!
Page(25-37): Done!
Task has been received.
Page(37-49): OCR is running...
Page(37-49): OCR finished
Page(37-49): Layout analysis finished.
Page(37-49): Table analysis finished.
Page(37-49): Text merging finished
Page(37-49): Finished slicing files(34). Start to embedding the content.
Page(37-49): Finished embedding(166.01)! Start to build index!
Page(37-49): Done!
Task has been received.
Page(49-61): OCR is running...
Page(49-61): OCR finished
Page(49-61): Layout analysis finished.
Page(49-61): Table analysis finished.
Page(49-61): Text merging finished
Page(49-61): Finished slicing files(37). Start to embedding the content.
Page(49-61): Finished embedding(235.59)! Start to build index!
Page(49-61): Done!
Task has been received.
Page(61-64): OCR is running...
Page(61-64): OCR finished
Page(61-64): Layout analysis finished.
Page(61-64): Table analysis finished.
Page(61-64): Text merging finished
Page(61-64): Finished slicing files(8). Start to embedding the content.
Page(61-64): Finished embedding(36.90)! Start to build index!
Page(61-64): Done!

Maybe it is related to: https://github.com/langchain-ai/langchain/discussions/12425

Many Thanks!