langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

"Output Anomalies in DIFY When Handling Long Content: Issues with Text Repetition and Formatting" #7075

Open mengdahuang opened 3 months ago

mengdahuang commented 3 months ago

Dify version

0.6.16

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Dear DIFY Team,

Fault Description: When DIFY outputs long content, the text and formatting near the end degrade: passages repeat, characters become garbled, excessive spacing appears, and similar anomalies occur. [Screenshots: iShot_2024-08-08_09 52 56, iShot_2024-08-08_09 50 45]
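For anyone trying to confirm these symptoms on their own output, here is a minimal sketch of a checker. The function name `find_anomalies` and its thresholds are hypothetical choices for illustration, not part of Dify; it only flags repeated word sequences, long whitespace runs, and Unicode replacement characters in a saved response.

```python
import re
from collections import Counter


def find_anomalies(text: str, ngram: int = 8, max_run: int = 3) -> dict:
    """Flag repetition, long whitespace runs, and replacement characters."""
    words = text.split()
    # Count every window of `ngram` consecutive words; a window that occurs
    # more than once suggests a repeated passage.
    windows = Counter(
        " ".join(words[i:i + ngram]) for i in range(len(words) - ngram + 1)
    )
    repeated = [w for w, n in windows.items() if n > 1]
    # Runs of spaces/tabs or of newlines longer than `max_run` characters.
    space_runs = re.findall(r"[ \t]{%d,}" % (max_run + 1), text)
    newline_runs = re.findall(r"\n{%d,}" % (max_run + 1), text)
    return {
        "repeated_passages": repeated,
        "space_runs": len(space_runs),
        "newline_runs": len(newline_runs),
        # U+FFFD usually indicates bytes that failed to decode.
        "replacement_chars": text.count("\ufffd"),
    }
```

Running it on the saved DIFY answer and on the FASTGPT answer for the same question gives a quick, comparable summary of what went wrong and roughly how often.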

Reproduction Steps: Initially, I suspected a limitation of the gpt-4o model. However, when I imported the same knowledge document into both DIFY and FASTGPT and asked each to output all content from the knowledge base (or one specific point, as long as there is enough content), FASTGPT's output was normal while DIFY's output was abnormal. [Screenshot: iShot_2024-08-08_09 46 00]

So, I conducted a second round of testing:

I exposed FASTGPT's (normal) output as an API and called it from a DIFY workflow using DIFY's HTTP request module, with DIFY's own knowledge base disabled. The FASTGPT API output itself was normal, but the content after it passed through DIFY still showed the anomalies described above. [Screenshot: iShot_2024-08-08_09 54 52]
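To pin down where the text relayed through DIFY starts to differ from the upstream FASTGPT response, the two strings can be compared directly. This is only a sketch: `first_divergence` is a hypothetical helper, and it assumes both responses have been saved as plain strings.

```python
from difflib import SequenceMatcher


def first_divergence(upstream: str, relayed: str, context: int = 40):
    """Return where `relayed` first differs from `upstream`, or None if equal."""
    if upstream == relayed:
        return None
    # Walk forward while the characters still match.
    limit = min(len(upstream), len(relayed))
    i = 0
    while i < limit and upstream[i] == relayed[i]:
        i += 1
    return {
        "offset": i,
        "upstream": upstream[i:i + context],
        "relayed": relayed[i:i + context],
        # Overall similarity in [0, 1]; autojunk is disabled so that long
        # texts are compared without the popularity heuristic.
        "similarity": SequenceMatcher(None, upstream, relayed, autojunk=False).ratio(),
    }
```

The `offset` value shows how far into the answer the output stays intact, which helps check whether the problem really appears only near the end of long content.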

I should add that this is not just a web rendering issue. Using the workflow described above, I published it as a DIFY application and queried it through the DIFY API with the same long text content. The API responses showed the same formatting problems, typos, and other anomalies.

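For reference, a query against the published application's API could be built roughly as follows. This sketch assumes a chat-type app served from a self-hosted base URL such as `http://localhost/v1`, and the `chat-messages` endpoint and payload fields are recalled from the Dify API documentation rather than confirmed in this thread. A workflow-type app would use a different endpoint. The helper name, API key, and user id are placeholders.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, query: str,
                       user: str = "repro-user") -> urllib.request.Request:
    """Build (but do not send) a blocking chat request for a Dify app."""
    payload = {
        "inputs": {},
        "query": query,
        # Assumption: "blocking" returns the full answer in one JSON body,
        # which takes browser rendering and streaming out of the picture.
        "response_mode": "blocking",
        "user": user,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` and saving the raw response body keeps the returned text exactly as the API produced it, so it can be inspected outside the web UI.
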
Look forward to your reply, thank you!

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 3 months ago

To resolve the issue of DIFY outputting long content with text repetition, garbled text, excessive spacing, and other anomalies in version 0.6.16 when self-hosted using Docker, you should ensure that your Docker setup and environment variables are correctly configured. Follow these steps:

  1. Check Docker and Docker Compose Installation: Ensure that Docker and Docker Compose are properly installed on your machine.

    docker --version
    docker-compose --version

  2. Update Environment Variables: Copy the example environment file and update it with your specific configurations.

    cd docker
    cp .env.example .env

    Edit the .env file to match your deployment needs.

  3. Run Docker Compose: Start the Dify server using Docker Compose.

    docker compose up -d

  4. Access Dify Console: After running the Docker Compose command, access the Dify console in your browser to complete the installation.

    http://localhost/install

  5. Custom Configuration: If you need custom configurations, update the docker-compose.yaml file and the .env file accordingly. After making changes, restart the services with Docker Compose.

    docker-compose up -d

  6. Check for Updates and Patches: Ensure you are using the latest version of Dify and check for any patches or updates that might address the issue [1].

Additionally, Dify includes observability features that allow you to monitor and analyze application logs and performance over time. This can help identify the root cause of issues such as text repetition, garbled text, and excessive spacing [2].

xrenaissance commented 3 months ago

I also had same issue as well.

mengdahuang commented 3 months ago

I also had same issue as well.

Looks like I've found a new friend on this issue!

Phillee80 commented 3 months ago

I also had same issue as well.

Looks like I've found a new friend on this issue!

Friend +1

mengdahuang commented 2 months ago

Dear Dify Team,

Is there any update on this issue?

Thanks, and best regards,

mengdahuang commented 1 month ago

@takatost Hi, could you please help check this issue? Thanks a lot.

Yevanchen commented 2 weeks ago

Hi, we are unable to reproduce this situation. Regarding this step: "Reproduction steps: Initially, I suspected it was an issue with the capabilities of the gpt-4o model. However, when I imported the same knowledge document into DIFY and FASTGPT and requested both to output all content from the knowledge base (or specific points as long as there is enough content)". I'm not sure whether I've missed some details here. Could you provide further clarification or the DSL?

mengdahuang commented 1 week ago

Hi, thanks for your reply.

This is unrelated to the DSL; it depends on the length of the knowledge base content. Whenever the response to your question is long enough, the issue appears. You can test it yourself. For example, create a knowledge base entry for a question about internal discounts at company xxx and paste a relatively long piece of text as the answer. Make sure that when the question hits, the response is that entire passage. If it is long enough, it will show the anomalies in the earlier screenshots. With the same content and model on FASTGPT, the problem does not occur, so it is not related to the model.
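One way to make this test easier to verify is to use a long answer whose sentences are numbered, so repetition or loss in the output can be checked mechanically. The sketch below is only illustrative: the helper names, the sentence wording, and the default length of 200 items are made up, and how long the text needs to be to trigger the issue is not established here.

```python
import re


def make_numbered_text(n_sentences: int = 200) -> str:
    """Build a long answer whose sentences carry sequence numbers."""
    return "\n".join(
        f"Item {i:04d}: discount rule number {i} applies to department {i % 7}."
        for i in range(1, n_sentences + 1)
    )


def check_sequence(output: str, n_sentences: int = 200) -> dict:
    """Report which numbered items are missing, repeated, or out of order."""
    seen = [int(m) for m in re.findall(r"Item (\d{4}):", output)]
    expected = set(range(1, n_sentences + 1))
    return {
        "missing": sorted(expected - set(seen)),
        "repeated": sorted({i for i in seen if seen.count(i) > 1}),
        "in_order": seen == sorted(seen),
    }
```

Paste the result of `make_numbered_text()` into the knowledge base as the answer, ask the matching question, and pass the returned text to `check_sequence()`. An intact response should report no missing or repeated items.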