langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

File upload related functions cannot be implemented by adding http request nodes. #9602

Open wangyingle666 opened 1 month ago

wangyingle666 commented 1 month ago

Dify version

0.10.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

The document extractor node can read the content of a document, but it cannot process some documents, such as Word files that contain tables. I therefore want to handle those documents with an HTTP request node that calls my own parsing service. However, I have not been able to get this working with the HTTP request node configuration.

This is the interface I wrote:

    curl -X 'POST' \
      'http://172.17.60.18:18207/file_parse/' \
      -H 'accept: application/json' \
      -H 'Content-Type: multipart/form-data' \
      -F 'files=@测试.txt;type=text/plain'
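For comparison with what the HTTP request node sends, here is a minimal Python sketch of the request that the curl command above produces. It uses only the standard library; the helper names `build_multipart` and `build_request` are hypothetical, and the sketch assumes the service expects a single part named `files`, as in the curl call. It is not Dify's implementation.

```python
import urllib.request
import uuid


def build_multipart(field, filename, content, content_type):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    lines = [
        f"--{boundary}".encode("ascii"),
        # The filename the server logs comes from this header line.
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'.encode("utf-8"),
        f"Content-Type: {content_type}".encode("ascii"),
        b"",
        content,
        f"--{boundary}--".encode("ascii"),
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


def build_request(url, filename, content):
    """Build (but do not send) a POST request equivalent to the curl call."""
    body, content_type = build_multipart("files", filename, content, "text/plain")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Accept": "application/json", "Content-Type": content_type},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The point of the sketch is that the `filename` value is chosen by the client, so a different name in the server log means the client put a different name in the `Content-Disposition` header.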

This is the HTTP request node I configured: [image]

This is the backend log when the request is sent through the HTTP request node. Where does this "upload" come from?

[image]

This is the log when the interface is called directly through curl: [image]

✔️ Expected Behavior

I want to extract the content of complex documents by configuring the HTTP request node to upload the file as an attachment to the interface I wrote.

❌ Actual Behavior

With the HTTP request node configured as shown, the document upload and parsing do not work.

dosubot[bot] commented 1 month ago

To address the issue of file uploads not working as expected with the HTTP request node in Dify, there are a few things to consider:

  1. File Naming Discrepancy: The difference in file names between your curl command and the HTTP request node is probably caused by how the node sets the filename. In Dify, the filename is typically derived from the URL's basename or guessed from the content type. If the filename appears as "upload" in the request sent by the node, check how the file path or name is specified in the node configuration [1].

  2. Configuration of HTTP Request Node: Ensure that the HTTP request node is correctly configured to handle multipart form-data. The Content-Type should be set to multipart/form-data, and the file should be included in the form-data with the correct key. The BodyData object should have its type set to "file" and the file attribute should point to the correct file path or identifier [2].

  3. Boundary and Headers: The Executor class in Dify handles multipart form-data by setting the Content-Type header with a boundary. Ensure that this is correctly set in your configuration. The boundary is crucial for separating parts of the form data [3].

  4. File Upload Handling: If the HTTP request node is not handling the file upload as expected, double-check the file path and ensure that the file is accessible and correctly specified in the node configuration. Also, verify that the server endpoint is correctly set up to receive multipart form-data and process the file accordingly.

By ensuring these configurations are correct, you should be able to achieve the desired file upload and parsing functionality using the HTTP request node in Dify.
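To check the filename and boundary points above on the receiving side, it can help to print which field name and filename each part of the incoming body carries. The following is a minimal sketch using Python's standard `email` package; the function name `part_filenames` is hypothetical, and it assumes you can capture the raw `Content-Type` header value and the raw request body in your service. It does not reproduce Dify's internal code.

```python
from email.parser import BytesParser
from email.policy import default


def part_filenames(content_type, body):
    """Return (field name, filename) for each part of a multipart/form-data body.

    content_type: the request's Content-Type header value, including the boundary.
    body: the raw request body as bytes.
    """
    # Prepend the header so the email parser can locate the boundary.
    header = f"Content-Type: {content_type}\r\nMIME-Version: 1.0\r\n\r\n".encode("ascii")
    message = BytesParser(policy=default).parsebytes(header + body)
    return [
        (part.get_param("name", header="content-disposition"), part.get_filename())
        for part in message.iter_parts()
    ]
```

If the result shows a filename such as "upload" instead of the original document name, the name was set by the sender before the request reached your service, so the place to fix it is the node configuration or the client, not the parsing endpoint.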

wangyingle666 commented 1 month ago

I need help ><

dosubot[bot] commented 1 week ago

Hi, @wangyingle666. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.

Issue Summary

Next Steps

Thank you for your understanding and contribution!