langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Workflow Step with Gemini Model Not Finishing Properly, Missing finish_reason and Usage Data #9870

Open nguyenphan opened 1 week ago

nguyenphan commented 1 week ago

Dify version

0.10.1

Cloud or Self Hosted

Cloud

Steps to reproduce

  1. Create a workflow using Dify with a step that utilizes the Gemini model (gemini-1.5-pro-latest).
  2. Configure the step to add translation notes based on an input list of translations and a reference image.
  3. Run the workflow with the above step.
  4. Observe that the step returns a status of succeeded, but the finish_reason is null and the usage data (e.g., total_tokens) is zero.

✔️ Expected Behavior

The workflow step should complete with a proper finish_reason and accurate usage data, similar to the behavior seen with other models like gpt4o-mini. The usage object should reflect the tokens used and provide correct pricing information.

❌ Actual Behavior

• The workflow step runs and returns a status of succeeded, but:
    • The finish_reason is null.
    • All values in the usage object (e.g., prompt_tokens, completion_tokens, total_tokens, total_price) are zero.
    • The response contains incomplete output, cutting off mid-sentence.
• This issue occurs only when using the Gemini model (gemini-1.5-pro-latest). The same workflow runs correctly when using the gpt4o-mini model.

Example of the node output:

{
    "id": "391decfc-743c-4171-a7a5-993b92ea3b15",
    "index": 1,
    "predecessor_node_id": null,
    "node_id": "1729500375254",
    "node_type": "llm",
    "title": "Add translation notes",
    "inputs": null,
    "process_data": {
        "model_mode": "chat",
        "prompts": [
            {
                "role": "system",
                "text": "```xml\n<instruction>\n  <task_description>\n    ***  </task_description>\n\n  <instructions>\n   *** \n  </instructions>\n\n  <examples>\n    <example>\n      Input:\n      Translations:\n{text-1}\nLorem ipsum dolor sit amet, consectetur adipiscing elit.\n\n{text-2}\nSed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n\n      Output:\n{text-1}\nLorem ipsum dolor sit amet, consectetur adipiscing elit.\n[Note: The English text at the top should be translated concisely to fit within the design. Consider using a shorter tagline that captures the essence of the message, such as \\\"Adipiscing elit.\\\" or \\\"Dolor sit amet.\\\"]\n\n{text-2}\nSed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n[Note: No change needed]\n    </example>\n  </examples>\n</instruction>\n```",
                "files": []
            },
            {
                "role": "user",
                "text": "Image: refer to the uploaded image file\nCurrent translations:\n{text-1}\\nWater-based paint crib\\n\\n{text-2}\\nOrdinary wooden crib\\n\\n{text-3}\\nDangerous for babies to bite and put in mouth\\n\\n{text-4}\\nSingle function, not very practical\\n\\n{text-5}\\nIron folding bed\\n\\n{text-6}\\nComposite board bed\\n\\n{text-7}\\nProne to rust\\n\\n{text-8}\\nIron parts are prone to rust and endanger health\\n\\n{text-9}\\nContains formaldehyde\\n\\n{text-10}\\nThe bed board contains glue and formaldehyde, which seriously exceeds the standard\\n\\n{text-11}\\nEasy to shake\\n\\n{text-12}\\nShaky and unsafe, the bed body tube is thin\\n\\n{text-13}\\nEasy to deform\\n\\n{text-14}\\nProne to dampness, illness, not waterproof, easy to deform\\n\\n{text-15}\\nMultifunctional four-in-one\\nOne bed meets multiple needs",
                "files": []
            }
        ],
        "model_provider": "google",
        "model_name": "gemini-1.5-pro-latest"
    },
    "outputs": {
        "text": "{text-1}\nWater-based paint crib\n[Note: No change needed]\n\n{text-2}\nOrdinary wooden crib\n[Note:",
        "usage": {
            "prompt_tokens": 0,
            "prompt_unit_price": "0.0",
            "prompt_price_unit": "0.0",
            "prompt_price": "0.0",
            "completion_tokens": 0,
            "completion_unit_price": "0.0",
            "completion_price_unit": "0.0",
            "completion_price": "0.0",
            "total_tokens": 0,
            "total_price": "0.0",
            "currency": "USD",
            "latency": 0.0
        },
        "finish_reason": null
    },
    "status": "succeeded",
    "error": null,
    "elapsed_time": 1.6948861805722117,
    "execution_metadata": {
        "total_tokens": 0,
        "total_price": "0.0",
        "currency": "USD"
    },
    "extras": {},
    "created_at": 1729848840,
    "created_by_role": "account",
    "created_by_account": {
        "id": "3fe4603a-f8a6-4de9-a92d-111172caa548",
        "name": "Dify",
        "email": "nguyen@inkr.com"
    },
    "created_by_end_user": null,
    "finished_at": 1729848840
}

This issue started occurring after the v0.10 update and only affects workflows using the Gemini model.
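For anyone trying to spot affected runs, the symptoms above can be checked mechanically. The following is a minimal sketch, assuming the node output has the same shape as the JSON shown in this report; the function name `find_missing_completion_metadata` and the trimmed `sample` are illustrative and not part of Dify.

```python
import json
from decimal import Decimal


def find_missing_completion_metadata(node: dict) -> list[str]:
    """Return the symptoms from this report found in a node output.

    An empty list means the node either did not succeed or looks healthy.
    """
    problems: list[str] = []
    # Only runs reported as successful are of interest here.
    if node.get("status") != "succeeded":
        return problems

    outputs = node.get("outputs") or {}
    usage = outputs.get("usage") or {}

    if outputs.get("finish_reason") is None:
        problems.append("finish_reason is null")

    for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
        if usage.get(key, 0) == 0:
            problems.append(f"usage.{key} is zero")

    # Prices are serialized as strings such as "0.0" in the node output.
    if Decimal(str(usage.get("total_price", "0"))) == 0:
        problems.append("usage.total_price is zero")

    return problems


# Trimmed copy of the node output from this report (hypothetical test input).
sample = json.loads("""
{
    "node_type": "llm",
    "outputs": {
        "text": "{text-1}",
        "usage": {
            "prompt_tokens": 0,
            "completion_tokens": 0,
            "total_tokens": 0,
            "total_price": "0.0",
            "currency": "USD"
        },
        "finish_reason": null
    },
    "status": "succeeded"
}
""")

for problem in find_missing_completion_metadata(sample):
    print(problem)
```

Running the same check against a node output from the gpt4o-mini run should, if that run is healthy as described, report nothing.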

dosubot[bot] commented 1 week ago

The issue you're experiencing with the Gemini model in Dify, where the workflow step returns a status of 'succeeded' but lacks a finish_reason and has zero usage data, might be related to changes introduced in the v0.10 update. This update replaced the "Image Upload" feature with a more comprehensive "File Upload" feature, marking the image upload option as LEGACY and planning for its future deprecation [1].

To address this, consider the following steps:

  1. Check File Upload Configuration: Ensure that your workflow is using the new "File Upload" feature correctly, as the legacy image upload might not be fully supported anymore.

  2. Review Workflow Logic: Double-check the logic and configuration of your workflow step to ensure it aligns with the new update's requirements.

  3. Debugging and Logging: Since the issue seems specific to the Gemini model, review any available logs or debugging information that might provide more insight into why the finish_reason and usage data are not being populated.

If these steps do not resolve the issue, it might be necessary to look into the specific configurations or error handling mechanisms for the Gemini model, although specific details on these were not found in the available documentation.


nguyenphan commented 1 week ago

I don't think it has anything to do with the image upload, though. I tested through the GUI and did not provide any image when testing the step. Just sending those inputs, the response breaks off halfway, as if it had been prematurely ended somehow.
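A rough way to flag this kind of cut-off output is sketched below. It is only a heuristic, and it assumes every note is wrapped in square brackets as in the system prompt's example (`[Note: ...]`); the function name `looks_truncated` is made up for illustration.

```python
def looks_truncated(text: str) -> bool:
    """Heuristic: report True when a '[' is opened but never closed.

    This only fits outputs that follow the "[Note: ...]" convention from
    the prompt; it says nothing about other kinds of truncation.
    """
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
    return depth > 0


# The "text" field returned by the Gemini step in this report.
broken_output = (
    "{text-1}\nWater-based paint crib\n[Note: No change needed]\n\n"
    "{text-2}\nOrdinary wooden crib\n[Note:"
)
print(looks_truncated(broken_output))
```

An output whose notes are all closed, such as the example in the system prompt, would not be flagged by this check.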