langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

LLM node cannot be stopped with Ollama #6493

Open Eelahne opened 1 month ago

Eelahne commented 1 month ago

Self Checks

Dify version

0.6.14

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

[image] (Ollama v0.2.7 running qwen2-7b on the 4090 in the 2nd slot; sometimes it just hangs and stays fully occupied like this for hours)

My workflow processes long context (over 20K) with tens of LLM node iterations. The LLM nodes are all served by Ollama on Linux, and I have updated through several recent versions of both Ollama and Dify, but the issue is still there. The process doesn't hang every time; roughly every two or three turns, one of the LLM nodes hangs. When this happens it never responds, so the workflow waits until the timeout (already set to 3600s) or until I notice and delete the Ollama pod.

I also suspected the issue could be caused by Ollama, but I searched the issues of both repos and found no topic that really matches.

✔️ Expected Behavior

I hope there is a solution, or to hear from anyone experiencing the same issue as mine.

❌ Actual Behavior

I manually delete the Ollama pod, wait for it to restart, and then hope it works through the next run and keeps serving.

dosubot[bot] commented 1 month ago

@Eelahne Hello there! I'm here to assist you with any bugs, questions, or contributions. Let's work together to solve this issue.

There are known issues with the Ollama integration in Dify that could cause LLM nodes to hang or fail during long-running processes. Specifically, when deploying Dify and Ollama using Docker, you might encounter connection errors due to the Docker container not being able to access the Ollama service when it is bound to localhost.

To resolve this, you need to expose the Ollama service to the network. You can set the OLLAMA_HOST environment variable to 0.0.0.0 to bind the service to all network interfaces. Additionally, you can use host.docker.internal instead of localhost to access the service from within the Docker container:

http://host.docker.internal:11434
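As a minimal sketch (the container name and port mapping are illustrative defaults, not taken from this issue), running Ollama in its official Docker image with the service bound to all interfaces could look like this:

    # Run Ollama with OLLAMA_HOST bound to all interfaces so other containers
    # (for example Dify's api and worker containers) can reach it on port 11434.
    docker run -d --name ollama \
      -e OLLAMA_HOST=0.0.0.0 \
      -p 11434:11434 \
      ollama/ollama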

For more detailed instructions on setting up the environment variables on different operating systems, refer to the Ollama documentation. Here are some specific steps for different operating systems:

Setting environment variables on Mac

If Ollama is run as a macOS application, environment variables should be set using launchctl:

  1. For each environment variable, call launchctl setenv.

    launchctl setenv OLLAMA_HOST "0.0.0.0"
  2. Restart the Ollama application.

  3. If the above steps are ineffective, you can use the following method:

    The issue lies within Docker itself: to reach the Docker host from inside a container, you should connect to host.docker.internal. Therefore, replacing localhost with host.docker.internal in the service URL will make it work.

    http://host.docker.internal:11434

Setting environment variables on Linux

If Ollama is run as a systemd service, environment variables should be set using systemctl:

  1. Edit the systemd service by calling systemctl edit ollama.service. This will open an editor.

  2. For each environment variable, add a line Environment under section [Service]:

    [Service]
    Environment="OLLAMA_HOST=0.0.0.0"
  3. Save and exit.

  4. Reload systemd and restart Ollama:

    systemctl daemon-reload
    systemctl restart ollama

Setting environment variables on Windows

On Windows, Ollama inherits your user and system environment variables.

  1. First, quit Ollama by clicking on it in the taskbar.

  2. Edit system environment variables from the control panel.

  3. Edit or create new variables for your user account for OLLAMA_HOST, OLLAMA_MODELS, etc. (a command-line alternative is sketched after these steps).

  4. Click OK/Apply to save.

  5. Run ollama from a new terminal window.
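If you prefer a terminal over the Control Panel dialog, the same user-level variable can be set with setx, as a sketch (run it from a PowerShell window, then start Ollama from a new terminal so the change is picked up):

    # Set OLLAMA_HOST for the current user account; only newly started processes see it
    setx OLLAMA_HOST "0.0.0.0"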

By following these steps, you can ensure that your Dockerized Dify environment can successfully communicate with the Ollama service, potentially resolving the hanging issue [1][2][3].


Eelahne commented 1 month ago

I am sure it is not a connection problem between the Dify and Ollama containers; they are both located in the same Kubernetes namespace and communicate with each other without any resistance. FYI, I estimated how often it happens: each turn includes 5 iterations of 13 LLM calls, and roughly 1 out of every 3 turns fails, which works out to a per-call failure probability of around 0.5% (1 / (3 × 5 × 13) ≈ 0.5%). So this seems to be a stability issue, right?

dosubot[bot] commented 1 month ago

To address the stability issue with LLM nodes hanging during long-running processes in your Dify setup, consider the following steps:

  1. Break Down Tasks: Divide complex tasks into smaller, more manageable steps using Dify's workflow capabilities. This reduces the load on individual nodes and enhances overall system stability.

  2. Use Logic Nodes: Incorporate various logic nodes such as code nodes, IF/ELSE nodes, merge nodes, and template conversion nodes. These nodes help manage the flow of tasks and ensure each step is completed before moving on to the next.

  3. Trigger by Time and Event: Utilize the ability to trigger workflows by time and event. This can help manage long-running processes by breaking them into scheduled tasks, reducing the risk of overloading the system.

  4. Debugging and Monitoring: Regularly debug workflows and monitor run history to identify and address any issues that may cause nodes to hang.

Here is an example of how you might structure a workflow to improve stability:

workflow:
  - name: Step 1
    type: code_node
    code: |
      # Your code here
      pass
  - name: Step 2
    type: if_else_node
    condition: |
      # Your condition here
    true_branch:
      - name: Sub-step 1
        type: code_node
        code: |
          # Your code here
          pass
    false_branch:
      - name: Sub-step 2
        type: code_node
        code: |
          # Your code here
          pass
  - name: Step 3
    type: merge_node
    branches: [Step 2.true_branch, Step 2.false_branch]
  - name: Step 4
    type: template_conversion_node
    template: |
      # Your template here

By structuring your workflow in this manner, you can ensure that each step is handled efficiently, reducing the likelihood of nodes hanging during long-running processes [1].


laipz8200 commented 1 month ago

Could you please provide the runtime logs for Dify at the time the issue occurred? If the workflow ran with an uncaught error, there should be some error information recorded in the logs.
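For reference, here is a rough sketch of how those logs could be collected; the docker-compose service names assume Dify's default docker-compose.yaml, and the Kubernetes deployment name and namespace are placeholders:

    # Docker Compose self-hosted setup: follow the API and worker logs
    docker compose logs -f --tail=200 api worker

    # Kubernetes setup (names are placeholders): dump the last hour of Dify API logs
    kubectl logs -n <namespace> deploy/dify-api --since=1h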

Eelahne commented 1 month ago

[image] I am suffering from this issue right now; I guess this is what you want. As you can see, it has already been idle for a long time (well, GPU utilization shows 311 W and the fan runs at 100% speed, so not totally idle) and nothing has been recorded in the log. How can I get out of this? I have to delete the Ollama pod by force, so nothing gets recorded anyway.
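For anyone stuck in the same state, a rough sketch of the recovery steps described above (the namespace and pod name are placeholders):

    # Check whether the GPU is actually busy or just pinned at high power draw
    nvidia-smi

    # Save whatever Ollama logged before recycling the pod
    kubectl logs -n <namespace> <ollama-pod> > ollama-hang.log

    # Force-delete the hung pod so its controller recreates it
    kubectl delete pod -n <namespace> <ollama-pod> --grace-period=0 --force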