langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Sandbox cuts off part of the response #5231

Open dailydaniel opened 2 weeks ago

dailydaniel commented 2 weeks ago


Dify version

0.6.10

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

Create a workflow with a code block (Python) and pass it a string input variable containing a large text.

✔️ Expected Behavior

Correct code execution in the sandbox and correct parsing of the response.

❌ Actual Behavior

The sandbox cuts off part of the response. Because the part containing the closing RESULT tag is missing, the regular expression in the transform_response function does not match, and the error “CodeBlock_name run failed: Failed to parse result” appears. The problem is in the sandbox.
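The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Dify's actual code: it assumes the result is wrapped in a pair of `<<RESULT>>` markers and that `transform_response` extracts it with a non-greedy regular expression. The marker format and the function body are assumptions made for illustration.

```python
import re

# Assumption: the sandbox output wraps the result in a pair of <<RESULT>> markers.
RESULT_PATTERN = re.compile(r"<<RESULT>>(.*?)<<RESULT>>", re.DOTALL)


def transform_response(response: str) -> str:
    """Return the text between the two markers, or raise if either marker is missing."""
    match = RESULT_PATTERN.search(response)
    if match is None:
        raise ValueError("Failed to parse result")
    return match.group(1)


payload = "x" * 1000
full = f"<<RESULT>>{payload}<<RESULT>>"
truncated = full[:500]  # simulates output that was cut off before the closing marker

# The complete response parses cleanly.
assert transform_response(full) == payload

# The truncated response has no closing marker, so the pattern cannot match.
try:
    transform_response(truncated)
except ValueError as exc:
    print(exc)
```

Under these assumptions, any truncation that removes the closing marker produces the same parse error regardless of how much of the payload survived.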

dosubot[bot] commented 2 weeks ago

Hello, @dailydaniel! I'm here to help you with any bugs, questions, or contributions you have for the repository. I'll assist you while you wait for a human maintainer.

The issue you're experiencing with the sandbox cutting off part of the response when using a large text string as an input variable is likely due to the CODE_MAX_STRING_LENGTH configuration. According to the provided docker-compose.yaml file, the CODE_MAX_STRING_LENGTH is set to 80000.

To resolve this, you can increase the CODE_MAX_STRING_LENGTH to a higher value that accommodates your input size. Here's how you can modify the docker-compose.yaml file:

services:
  api:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Increase this value as needed
      # other configurations...
  worker:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Ensure this matches the api service
      # other configurations...
  sandbox:
    environment:
      # other configurations...
      CODE_MAX_STRING_LENGTH: 160000  # Ensure this matches the api and worker services
      # other configurations...

After making this change, restart your services to apply the new configuration. This should prevent the sandbox from cutting off part of the response and allow the transform_response function to parse the result correctly.

Additionally, ensure that the CODE_EXECUTION_TIMEOUT is set appropriately in your code_executor.py file to handle longer execution times if needed:

CODE_EXECUTION_TIMEOUT = (10, 60)  # Adjust the timeout values as necessary

Moreover, the code_node.py file contains the MAX_STRING_LENGTH constant, which is used to validate the length of strings. Ensure that this constant is set correctly:

MAX_STRING_LENGTH = int(os.environ.get('CODE_MAX_STRING_LENGTH', '80000'))

By increasing the CODE_MAX_STRING_LENGTH in your environment configuration, you should be able to handle larger input strings without truncation, thus avoiding the issue with the transform_response function [1][2][3].
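Before relying on this setting, it may help to confirm which services actually define it. The sketch below assumes the compose file has already been parsed into a Python dict (for example with a YAML library); the helper name `services_missing_key` and the sample data are hypothetical and only illustrate the check.

```python
def services_missing_key(compose: dict, key: str, services: list[str]) -> list[str]:
    """Return the names of the given services whose environment does not define key."""
    missing = []
    for name in services:
        env = compose.get("services", {}).get(name, {}).get("environment") or {}
        # Compose accepts either a mapping or a list of "KEY=value" strings.
        if isinstance(env, list):
            keys = {item.split("=", 1)[0] for item in env}
        else:
            keys = set(env)
        if key not in keys:
            missing.append(name)
    return missing


# Hypothetical parsed compose data, used only to exercise the helper.
compose = {
    "services": {
        "api": {"environment": {"CODE_MAX_STRING_LENGTH": 160000}},
        "worker": {"environment": ["CODE_MAX_STRING_LENGTH=160000"]},
        "sandbox": {"environment": {"API_KEY": "dify-sandbox"}},
    }
}

print(services_missing_key(compose, "CODE_MAX_STRING_LENGTH", ["api", "worker", "sandbox"]))
# prints ['sandbox']
```

A service that does not read the variable will ignore it even if it is added, so a missing entry is only a hint about where the setting applies.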

To continue talking to Dosu, mention @dosu.

dailydaniel commented 2 weeks ago

There is no CODE_MAX_STRING_LENGTH in docker-compose.yaml in the sandbox section. I also increased this variable everywhere (in the environment and other files), the result did not change.

dosubot[bot] commented 2 weeks ago

To address the issue of the sandbox cutting off part of the response, even after increasing the CODE_MAX_STRING_LENGTH in all relevant places, you should also consider the following:

  1. Verify Sandbox Configuration: Ensure that the sandbox service itself is configured to handle large strings. Specifically, check the WORKER_TIMEOUT setting in the docker-compose.yaml file to allow sufficient time for processing large strings.

    sandbox:
      image: langgenius/dify-sandbox:0.2.1
      restart: always
      environment:
        API_KEY: dify-sandbox
        GIN_MODE: 'release'
        WORKER_TIMEOUT: 60  # Increase this value if needed
        ENABLE_NETWORK: 'true'
        HTTP_PROXY: 'http://ssrf_proxy:3128'
        HTTPS_PROXY: 'http://ssrf_proxy:3128'
        SANDBOX_PORT: 8194
      volumes:
        - ./volumes/sandbox/dependencies:/dependencies
      networks:
        - ssrf_proxy_network
  2. Check for Maximum Payload Size: Ensure that there are no other maximum payload size or request size limits configured in the sandbox service or other parts of the Dify infrastructure. The CODE_MAX_STRING_LENGTH and TEMPLATE_TRANSFORM_MAX_LENGTH should be set appropriately.

    CODE_MAX_STRING_LENGTH: 160000
    TEMPLATE_TRANSFORM_MAX_LENGTH: 160000
  3. Logging and Debugging: Enable detailed logging to get more insights into why the response is being truncated. Change the GIN_MODE to 'debug' in the sandbox service configuration.

    GIN_MODE: 'debug'
  4. Nginx Configuration: Ensure that the Nginx configuration allows for large request bodies. The client_max_body_size should be set to a value that can accommodate your large text strings.

    client_max_body_size 15M;
  5. Code Adjustments: Verify that the code_executor.py and code_node.py files are correctly handling the CODE_MAX_STRING_LENGTH setting. Ensure that the _check_string method in code_node.py is properly configured.

    import os

    MAX_STRING_LENGTH = int(os.environ.get('CODE_MAX_STRING_LENGTH', '160000'))

    # _check_string is a method in code_node.py (enclosing class omitted here)
    def _check_string(self, value: str, variable: str) -> str:
        if not isinstance(value, str):
            raise ValueError(f"Output variable `{variable}` must be a string")

        if len(value) > MAX_STRING_LENGTH:
            raise ValueError(f'The length of output variable `{variable}` must be less than {MAX_STRING_LENGTH} characters')

        # Strip NUL characters from the returned value
        return value.replace('\x00', '')

By ensuring these configurations and settings are correctly applied, you should be able to handle larger input strings without truncation, thus avoiding the issue with the transform_response function [1][2].
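To see what the length check does when the limit is exceeded, here is a standalone adaptation of the snippet above. The free function `check_string` and its `limit` parameter are illustrative assumptions, not Dify's API; the logic mirrors the quoted `_check_string` method.

```python
import os

# Same environment lookup as in the quoted snippet, with the 80000 default.
MAX_STRING_LENGTH = int(os.environ.get("CODE_MAX_STRING_LENGTH", "80000"))


def check_string(value: str, variable: str, limit: int = MAX_STRING_LENGTH) -> str:
    """Validate an output string: reject non-strings and over-long values."""
    if not isinstance(value, str):
        raise ValueError(f"Output variable `{variable}` must be a string")
    if len(value) > limit:
        raise ValueError(
            f"The length of output variable `{variable}` must be less than {limit} characters"
        )
    # NUL characters are removed; nothing else is changed or shortened.
    return value.replace("\x00", "")


# A value within the limit is returned with its full length.
assert check_string("a" * 10, "result", limit=10) == "a" * 10

# A value over the limit raises instead of being cut short.
try:
    check_string("a" * 11, "result", limit=10)
except ValueError as exc:
    print(exc)
```

As far as this snippet goes, exceeding the limit raises a length error rather than silently shortening the string, which is a different message from "Failed to parse result".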


mkaskov commented 2 weeks ago

I have the same issue with long data