Sandbox cuts off part of the response

dailydaniel commented 2 weeks ago

Self Checks

[X] This is only for bug report, if you would like to ask a question, please head to Discussions.
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.10

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

Create a workflow with a Python code block and pass it a string containing a large text as the input variable.
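
A minimal sketch of such a code block, assuming Dify's convention that a Python code node defines a `main` function returning a dict of output variables. The names `text` and `result` are hypothetical; the body only echoes the input so that the sandbox response grows with the input size:

```python
def main(text: str) -> dict:
    # Echo the large input back unchanged, so the size of the
    # sandbox response scales with the size of the input string.
    return {"result": text}
```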

✔️ Expected Behavior

Correct code execution in the sandbox and response parsing

❌ Actual Behavior

Part of the response from the sandbox is cut off, so the regular expression in the transform_response function does not match (the part of the response containing the closing RESULT tag is missing) and the error “CodeBlock_name run failed: Failed to parse result” appears. The problem is in the sandbox.
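
To illustrate the failure mode, here is a minimal sketch. It is not Dify's actual implementation: it assumes the sandbox wraps a JSON result between two `<<RESULT>>` markers and that the parser extracts it with a regular expression. The marker string and the helper name `parse_result` are assumptions made for illustration.

```python
import json
import re

RESULT_TAG = "<<RESULT>>"  # assumed marker; the real tag may differ


def parse_result(response: str) -> dict:
    """Extract the JSON payload between the opening and closing markers."""
    match = re.search(
        re.escape(RESULT_TAG) + r"(.*?)" + re.escape(RESULT_TAG),
        response,
        re.DOTALL,
    )
    if match is None:
        # No closing marker found, e.g. because the response was cut off.
        raise ValueError("Failed to parse result")
    return json.loads(match.group(1))


full = RESULT_TAG + json.dumps({"result": "x" * 1000}) + RESULT_TAG
truncated = full[:500]  # simulate a response cut off mid-payload
```

With these inputs, `parse_result(full)` returns the original dict, while `parse_result(truncated)` raises `ValueError` because the closing marker never arrives.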

dosubot[bot] commented 2 weeks ago

Hello, @dailydaniel! I'm here to help you with any bugs, questions, or contributions you have for the repository. I'll assist you while you wait for a human maintainer.

The issue you're experiencing with the sandbox cutting off part of the response when using a large text string as an input variable is likely due to the CODE_MAX_STRING_LENGTH configuration. According to the provided docker-compose.yaml file, the CODE_MAX_STRING_LENGTH is set to 80000.

To resolve this, you can increase the CODE_MAX_STRING_LENGTH to a higher value that accommodates your input size. Here's how you can modify the docker-compose.yaml file:

services:
  api:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Increase this value as needed
      # other configurations...
  worker:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Ensure this matches the api service
      # other configurations...
  sandbox:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Ensure this matches the api and worker services
      # other configurations...

After making this change, restart your services to apply the new configuration. This should prevent the sandbox from cutting off part of the response and allow the transform_response function to parse the result correctly.

Additionally, ensure that the CODE_EXECUTION_TIMEOUT is set appropriately in your code_executor.py file to handle longer execution times if needed:

CODE_EXECUTION_TIMEOUT = (10, 60)  # Adjust the timeout values as necessary

Moreover, the code_node.py file contains the MAX_STRING_LENGTH constant, which is used to validate the length of strings. Ensure that this constant is set correctly:

MAX_STRING_LENGTH = int(os.environ.get('CODE_MAX_STRING_LENGTH', '80000'))

By increasing the CODE_MAX_STRING_LENGTH in your environment configuration, you should be able to handle larger input strings without truncation, thus avoiding the issue with the transform_response function ^[1]^[2]^[3].

To continue talking to Dosu, mention @dosu.

dailydaniel commented 2 weeks ago

There is no CODE_MAX_STRING_LENGTH in the sandbox section of docker-compose.yaml. I also increased this variable everywhere (in the environment and in other files), but the result did not change.

dosubot[bot] commented 2 weeks ago

To address the issue of the sandbox cutting off part of the response, even after increasing the CODE_MAX_STRING_LENGTH in all relevant places, you should also consider the following:

Verify Sandbox Configuration: Ensure that the sandbox service itself is configured to handle large strings. Specifically, check the WORKER_TIMEOUT setting in the docker-compose.yaml file to allow sufficient time for processing large strings.

sandbox:
  image: langgenius/dify-sandbox:0.2.1
  restart: always
  environment:
    API_KEY: dify-sandbox
    GIN_MODE: 'release'
    WORKER_TIMEOUT: 60  # Increase this value if needed
    ENABLE_NETWORK: 'true'
    HTTP_PROXY: 'http://ssrf_proxy:3128'
    HTTPS_PROXY: 'http://ssrf_proxy:3128'
    SANDBOX_PORT: 8194
  volumes:
    - ./volumes/sandbox/dependencies:/dependencies
  networks:
    - ssrf_proxy_network

Check for Maximum Payload Size: Ensure that there are no other maximum payload size or request size limits configured in the sandbox service or other parts of the Dify infrastructure. The CODE_MAX_STRING_LENGTH and TEMPLATE_TRANSFORM_MAX_LENGTH should be set appropriately.
```
CODE_MAX_STRING_LENGTH: 160000
TEMPLATE_TRANSFORM_MAX_LENGTH: 160000
```
Logging and Debugging: Enable detailed logging to get more insights into why the response is being truncated. Change the GIN_MODE to 'debug' in the sandbox service configuration.
```
GIN_MODE: 'debug'
```
Nginx Configuration: Ensure that the Nginx configuration allows for large request bodies. The client_max_body_size should be set to a value that can accommodate your large text strings.
```
client_max_body_size 15M;
```

Code Adjustments: Verify that the code_executor.py and code_node.py files are correctly handling the CODE_MAX_STRING_LENGTH setting. Ensure that the _check_string method in code_node.py is properly configured.

MAX_STRING_LENGTH = int(os.environ.get('CODE_MAX_STRING_LENGTH', '160000'))

def _check_string(self, value: str, variable: str) -> str:
    if not isinstance(value, str):
        raise ValueError(f"Output variable `{variable}` must be a string")

    if len(value) > MAX_STRING_LENGTH:
        raise ValueError(f'The length of output variable `{variable}` must be less than {MAX_STRING_LENGTH} characters')

    return value.replace('\x00', '')
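
As a rough standalone sketch of how this check behaves (a free function instead of the method; the name `check_string` is hypothetical), a string over the limit is rejected with an error rather than truncated, and NUL characters are stripped:

```python
import os

# Same default as in the snippet above; overridden by the environment if set.
MAX_STRING_LENGTH = int(os.environ.get("CODE_MAX_STRING_LENGTH", "160000"))


def check_string(value: str, variable: str) -> str:
    if not isinstance(value, str):
        raise ValueError(f"Output variable `{variable}` must be a string")
    if len(value) > MAX_STRING_LENGTH:
        raise ValueError(
            f"The length of output variable `{variable}` must be less than "
            f"{MAX_STRING_LENGTH} characters"
        )
    # NUL characters are removed; the rest of the string is left untouched.
    return value.replace("\x00", "")


print(check_string("ab\x00c", "result"))
try:
    check_string("x" * (MAX_STRING_LENGTH + 1), "result")
except ValueError as exc:
    print(exc)
```

Note that this check raises its own error message, which differs from the reported "Failed to parse result", so it may not be where the truncation happens.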

By ensuring these configurations and settings are correctly applied, you should be able to handle larger input strings without truncation, thus avoiding the issue with the transform_response function ^[1]^[2].
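
One way to narrow this down is to find the input size at which truncation begins. The sketch below is a generic binary search and not part of Dify. `survives` is a hypothetical callback that would run the workflow with an input of `n` characters and report whether the result was parsed; the sketch assumes every size up to some threshold succeeds and every larger size fails.

```python
from typing import Callable


def largest_surviving_size(
    survives: Callable[[int], bool], low: int, high: int
) -> int:
    """Return the largest n in [low, high] for which survives(n) is True.

    Assumes survives(low) is True and that failures are monotonic in n.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if survives(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

Comparing the size found this way with the configured limits (for example CODE_MAX_STRING_LENGTH) can indicate whether one of them is responsible or whether the cut-off comes from somewhere else.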


mkaskov commented 2 weeks ago

I have the same issue with long data
