Issue with Increased API Call Time When Using Code Execution in Workflow

hzho2000 commented 1 month ago

Self Checks

[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

I really like the code execution feature within workflows. It lets me write arbitrary Python code to check whether the currently generated data is reasonable. However, I have noticed that even for very simple functions, including a code node in the workflow increases the API call time by 100 ms. I would like to understand whether a persistent Python service runs while the workflow service is enabled, or whether a new Python instance is started on each API call.

2. Additional context or comments

This feature is very useful for data validation and quality checks, and the delay in API response time can impact performance significantly. Any documentation or explanation about the underlying mechanism would be greatly appreciated.

3. Can you help us with this feature?

[X] I am interested in contributing to this feature.

dosubot[bot] commented 1 month ago

Hey there, @hzho2000! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's tackle this issue together!

The increased API call time you're experiencing when using the code execution feature in workflows is likely due to the initialization of a new Python instance for each API call. This can introduce overhead, especially if the environment needs to be set up from scratch each time.
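To get a feel for how much a fresh interpreter per call could cost, here is a minimal sketch that compares spawning a new Python process with running the same snippet in an already running interpreter. It uses only the standard library and says nothing about how Dify actually executes code nodes; the helper names `time_fresh_interpreter` and `time_in_process` are made up for illustration.

```python
import subprocess
import sys
import time


def time_fresh_interpreter(code: str) -> float:
    """Seconds taken to start a new Python process and run `code` in it."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start


def time_in_process(code: str) -> float:
    """Seconds taken to run `code` inside the current, already warm interpreter."""
    compiled = compile(code, "<snippet>", "exec")
    start = time.perf_counter()
    exec(compiled, {})  # isolated globals so the snippet cannot touch ours
    return time.perf_counter() - start


# A trivial snippet, standing in for a very simple code node (assumption).
snippet = "result = sum(range(100))"

fresh = time_fresh_interpreter(snippet)
warm = time_in_process(snippet)

print(f"fresh interpreter: {fresh * 1000:.2f} ms")
print(f"in-process:        {warm * 1000:.4f} ms")
```

The absolute numbers depend on the machine, so treat them only as a rough indication of process startup overhead.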

Currently, there is no specific documentation indicating whether Dify uses a persistent Python interpreter or spawns a new instance for each API call ^[1]. However, there are configuration options available to optimize the initialization time for Python code execution within workflows. You can adjust the WORKER_TIMEOUT parameter in the docker-compose.middleware.yaml file to potentially reduce startup time ^[2]^[3].

Additionally, Dify provides built-in mechanisms for profiling or logging the execution time of individual code nodes within a workflow. This can help you identify and optimize performance bottlenecks ^[4].

For more detailed information on how Dify handles the execution of Python code within workflows, you can refer to the documentation on code nodes ^[5]^[1].


crazywoola commented 1 month ago

Regarding the service, I guess you are referring to this one: https://github.com/langgenius/dify-sandbox. It is a standalone service that runs in the background and accepts data from the API.

Here is the call stack: api -> ssrf_proxy -> code_sandbox
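Even with a persistent service, each hop in a chain like this adds a network round trip. The sketch below measures the round-trip time of a POST request to a local stand-in HTTP server using only the standard library. It is an assumption-laden illustration: `EchoHandler` and `measure_round_trip` are hypothetical names, and the server is not dify-sandbox and does not use its real endpoints or payload format.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Stand-in for a persistent code service: echoes the request body back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the timing output free of access-log noise


def measure_round_trip(payload: bytes, repeats: int = 5) -> list[float]:
    """Return per-request round-trip times (seconds) against a local server."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    # Bypass any proxy configured in the environment for this local test.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    timings = []
    try:
        for _ in range(repeats):
            request = urllib.request.Request(url, data=payload, method="POST")
            start = time.perf_counter()
            with opener.open(request) as response:
                echoed = response.read()
            timings.append(time.perf_counter() - start)
            assert echoed == payload
    finally:
        server.shutdown()
        server.server_close()
    return timings


timings = measure_round_trip(b"print('ok')")
for i, seconds in enumerate(timings, start=1):
    print(f"request {i}: {seconds * 1000:.2f} ms")
```

Timing the real chain the same way, once from the API side and once directly against the sandbox service, would show how much of the delay comes from the intermediate hops rather than from code execution itself.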
