langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Documents cannot be indexed in economy mode #27

Closed charli117 closed 1 year ago

charli117 commented 1 year ago

Chunk indexing fails for documents uploaded in economy mode. What is causing this? [image]

charli117 commented 1 year ago

The error is as follows:

[2023-05-15 19:05:35,192: ERROR/MainProcess] consume document failed
Traceback (most recent call last):
  File "/app/api/tasks/document_indexing_task.py", line 41, in document_indexing_task
    indexing_runner.run(document)
  File "/app/api/core/indexing_runner.py", line 47, in run
    text_docs = self._load_data(document)
  File "/app/api/core/indexing_runner.py", line 216, in _load_data
    text_docs = self._load_data_from_file(file_detail)
  File "/app/api/core/indexing_runner.py", line 245, in _load_data_from_file
    self.storage.download(upload_file.key, filepath)
  File "/app/api/extensions/ext_storage.py", line 75, in download
    client.download_file(self.bucket_name, filename, target_filepath)
  File "/usr/local/lib/python3.10/site-packages/boto3/s3/inject.py", line 190, in download_file
    return transfer.download_file(
  File "/usr/local/lib/python3.10/site-packages/boto3/s3/transfer.py", line 326, in download_file
    future.result()
  File "/usr/local/lib/python3.10/site-packages/s3transfer/futures.py", line 103, in result
    return self._coordinator.result()
  File "/usr/local/lib/python3.10/site-packages/s3transfer/futures.py", line 266, in result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/s3transfer/tasks.py", line 269, in _main
    self._submit(transfer_future=transfer_future, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/s3transfer/download.py", line 354, in _submit
    response = client.head_object(
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 530, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.10/site-packages/botocore/client.py", line 960, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found
[2023-05-15 19:05:35,205: INFO/MainProcess] Task tasks.document_indexing_task.document_indexing_task[5dfdec63-a311-4650-b373-d3342f6fc3e5] succeeded in 0.09100910206325352s: None

takatost commented 1 year ago

Have you configured S3? The error says the file cannot be found.
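
One way to confirm that diagnosis is to repeat the HeadObject call that the traceback ends in and see whether the object is really absent. The following is only a minimal sketch: `object_exists` is a hypothetical helper, not part of Dify, and it assumes a boto3-style S3 client whose errors carry a `response` dictionary the way botocore's `ClientError` does.

```python
def object_exists(client, bucket, key):
    """Return True if the object is present, False if HeadObject reports 404.

    `client` is assumed to be a boto3 S3 client (or any object with a
    compatible head_object method). Errors other than "not found" are
    re-raised so that permission or network problems are not hidden.
    """
    try:
        client.head_object(Bucket=bucket, Key=key)
        return True
    except Exception as exc:
        # botocore's ClientError exposes the parsed error in `.response`.
        response = getattr(exc, "response", None) or {}
        code = str(response.get("Error", {}).get("Code", ""))
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
```

Calling this with the same bucket name and file key from inside each service's container would show whether that service can actually see the uploaded file.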

charli117 commented 1 year ago

Yes, it is configured. That was also the first thing I thought of when I saw the error. [image]

takatost commented 1 year ago

Is it configured for both the api and worker services? Both need to be able to access the same S3 bucket.
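
A quick way to check this is to compare the storage settings of the two services side by side. The sketch below is illustrative only: the variable names in `STORAGE_KEYS` are assumed to match the deployment's environment (verify them against your own configuration), and the bucket names are made up.

```python
# Assumed names of the storage-related environment variables; adjust as needed.
STORAGE_KEYS = ("STORAGE_TYPE", "S3_ENDPOINT", "S3_REGION", "S3_BUCKET_NAME", "S3_ACCESS_KEY")

def storage_mismatches(api_env, worker_env, keys=STORAGE_KEYS):
    """Return {key: (api_value, worker_value)} for every setting that differs."""
    return {
        key: (api_env.get(key), worker_env.get(key))
        for key in keys
        if api_env.get(key) != worker_env.get(key)
    }

# Hypothetical environments for the two services.
api_env = {"STORAGE_TYPE": "s3", "S3_BUCKET_NAME": "dify-api"}
worker_env = {"STORAGE_TYPE": "s3", "S3_BUCKET_NAME": "dify-worker"}

print(storage_mismatches(api_env, worker_env))
# {'S3_BUCKET_NAME': ('dify-api', 'dify-worker')}
```

An empty result means the api service writes uploads to the same place the worker reads them from; any reported key is a candidate cause of the 404.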

charli117 commented 1 year ago

Oh, I see, that was my mistake: I had set them to different buckets. One more question: when configuring the Azure token, I get the message "MOCK: This provider is not supported yet." Is Azure OpenAI not supported yet? [screenshot]

takatost commented 1 year ago

Azure OpenAI is not supported yet. We are currently designing the interface for it.

charli117 commented 1 year ago

I see. Is there any temporary workaround that would let us use Azure OpenAI for now? It is the only provider we have.

takatost commented 1 year ago

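Not for the time being. Azure's models do not have fixed names like gpt-3.5-turbo; they use custom names (deployments), so we need to design additional interface and logic to call them. With the current version, editing the data directly will not make it work either. We will add the integration as soon as we can.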
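
To illustrate the difference described above, here is a rough sketch of how the two request targets are built. It is not Dify's implementation. The helper names, the resource name `my-resource`, and the deployment name `my-gpt35-deployment` are hypothetical; the URL shapes follow the public OpenAI and Azure OpenAI REST conventions.

```python
from urllib.parse import quote

def openai_request(model):
    """OpenAI: one fixed endpoint; the model is chosen by a well-known name in the body."""
    return {
        "url": "https://api.openai.com/v1/chat/completions",
        "body": {"model": model},
    }

def azure_request(resource, deployment, api_version):
    """Azure OpenAI: the user-defined deployment name is part of the URL path."""
    url = (
        f"https://{resource}.openai.azure.com/openai/deployments/"
        f"{quote(deployment, safe='')}/chat/completions?api-version={api_version}"
    )
    return {"url": url, "body": {}}

print(openai_request("gpt-3.5-turbo")["url"])
print(azure_request("my-resource", "my-gpt35-deployment", "2023-05-15")["url"])
```

Because the deployment name is chosen by each user, it cannot be offered from a fixed model list, which is why extra input fields and call logic are needed.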

leoterry-ulrica commented 1 month ago

@takatost Is there an API for previewing or downloading the source files of a knowledge base?