langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

"Process rule is required" error when calling create_by_file to create a document in a dataset #7018

Closed bnuzhouwei closed 2 months ago

bnuzhouwei commented 2 months ago

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

[Screenshot: 微信截图_20240806213428 (WeChat screenshot)]

✔️ Expected Behavior

No response

❌ Actual Behavior

{ "code": "invalid_param", "message": "Process rule is required", "status": 400 }

dosubot[bot] commented 2 months ago

Hey @bnuzhouwei! 👋 I'm here to help you with bugs, questions, and becoming a contributor. Let me know how I can assist you while waiting for a human maintainer.

To resolve the issue of receiving a 400 status code with the error message "Process rule is required" when making a POST request to create a document by file in Dify version 0.6.11, ensure that the process_rule parameter is correctly included and properly formatted in the request body.

Here is an example of how to structure your request:

curl --location --request POST 'https://dify.r.ironpy.cn/v1/datasets/4affe743-b13e-4de8-be77-5104980e4d76/document/create_by_file' \
--header 'Authorization: Bearer {api_key}' \
--form 'data="{\"indexing_technique\":\"high_quality\",\"process_rule\":{\"mode\":\"custom\",\"rules\":{\"pre_processing_rules\":[{\"id\":\"remove_extra_spaces\",\"enabled\":true},{\"id\":\"remove_urls_emails\",\"enabled\":true}],\"segmentation\":{\"separator\":\"###\",\"max_tokens\":500}}}}";type=text/plain' \
--form 'file=@"/path/to/file"'

Make sure that the process_rule object is properly formatted and included in the data field of the multipart form-data. The process_rule should contain the mode and rules as shown in the example. Additionally, ensure that the Authorization header includes your API-Key in the format Bearer {api_key} [1][2].
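
For anyone calling the API from Python instead of curl, the same request can be sketched roughly as follows. This is a minimal sketch, not an official client: the function names `build_data_field` and `create_document_by_file` are made up for illustration, the third-party `requests` package is assumed to be installed, and `base_url` is assumed to already include the `/v1` prefix. The field values simply mirror the curl example above.

```python
import json


def build_data_field(separator: str = "###", max_tokens: int = 500) -> str:
    """Return the JSON string for the `data` form field, including process_rule."""
    payload = {
        "indexing_technique": "high_quality",
        "process_rule": {
            "mode": "custom",
            "rules": {
                "pre_processing_rules": [
                    {"id": "remove_extra_spaces", "enabled": True},
                    {"id": "remove_urls_emails", "enabled": True},
                ],
                "segmentation": {"separator": separator, "max_tokens": max_tokens},
            },
        },
    }
    return json.dumps(payload)


def create_document_by_file(base_url: str, dataset_id: str, api_key: str, file_path: str):
    """Send the multipart request (hypothetical helper; requires `requests`)."""
    import requests  # imported lazily so build_data_field works without it

    url = f"{base_url}/datasets/{dataset_id}/document/create_by_file"
    with open(file_path, "rb") as fh:
        return requests.post(
            url,
            headers={"Authorization": f"Bearer {api_key}"},
            files={
                # (None, value, content_type) sends a plain form field, not a file
                "data": (None, build_data_field(), "text/plain"),
                "file": fh,
            },
        )
```

Printing `build_data_field()` before sending is a quick way to confirm that `process_rule` really is present in the `data` field, which is what the 400 error complains about.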


crazywoola commented 2 months ago

See bot's reply.

bnuzhouwei commented 2 months ago

Thanks, problem solved. However, when creating a new file, original_document_id can't be set; otherwise the request returns an error.

Could the API accept a document_id so that the document is created or updated as appropriate?
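
Until something like that exists, one client-side workaround is to include original_document_id only when an existing document is being replaced. The sketch below assumes that behaviour from the comment above (the field must be absent on create). The helper name `build_upsert_data` is hypothetical and not part of Dify; it only builds the JSON for the `data` form field.

```python
import json
from typing import Optional


def build_upsert_data(process_rule: dict, original_document_id: Optional[str] = None) -> str:
    """Build the `data` JSON; add original_document_id only when updating."""
    payload = {
        "indexing_technique": "high_quality",
        "process_rule": process_rule,
    }
    if original_document_id is not None:
        # Assumption: the field must be omitted entirely for a brand-new document.
        payload["original_document_id"] = original_document_id
    return json.dumps(payload)
```

Calling `build_upsert_data(rule)` gives a create payload, while `build_upsert_data(rule, existing_id)` gives an update payload, so the caller decides between the two based on whether it already knows a document ID.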