langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Response format for Gemini models is answering in json format with Vertex AI #4649

Closed AndyMik90 closed 2 months ago

AndyMik90 commented 4 months ago

Self Checks

Dify version

0.6.8

Cloud or Self Hosted

Cloud, Self Hosted (Docker), Self Hosted (Source)

Steps to reproduce

Use Gemini models and observe the response. (Screenshot attached: CleanShot 2024-05-24 at 19 15 13.)

✔️ Expected Behavior

Normal response from Gemini models.

❌ Actual Behavior

Gemini 1.5 Flash and Pro answer back in JSON format. Gemini 1.0 Vision gave the error: `[vertex_ai] Bad Request Error, 400 The input system_instruction is not supported.`

dosubot[bot] commented 4 months ago

To address the issue where Gemini models in Dify version 0.6.8 respond in JSON format or return errors, you can use the agent mode with plugins. This allows the model to use external tools to provide more accurate and contextually appropriate responses.

Here is how you can configure and use plugins to improve the conversation flow:

  1. Use Plugins:

    • Google Search Plugin: This plugin allows the model to search Google for answers.
    • Web Reader Plugin: This plugin reads the content of linked web pages.
    • Wikipedia Plugin: This plugin searches Wikipedia for answers.
  2. Configure Plugins:

    • For example, if you use the Google search plugin, you need to configure the SerpAPI key.
  3. Select Plugins Before Conversation:

    • Choose the necessary plugins before starting the conversation to ensure the model can use them when needed.
  4. Use Knowledge:

    • Select the relevant knowledge base for the conversation. The model will use this data set to find answers related to the user's questions.

By configuring and using these plugins and knowledge bases, you can enhance the model's ability to provide natural and accurate responses, thereby improving the overall conversation flow [1].

Additionally, you can customize the response formats by composing system prompts and setting the context referenced in the prompts. This includes customizing the content of system prompts, user messages, and assistant messages, and inserting special variable blocks like "Conversation History" and "Context" to enrich conversational functionalities [2].
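As a sketch of what such a customized system prompt might look like: the variable tokens below (`{{#context#}}`, `{{#histories#}}`) are assumptions about Dify's advanced-prompt template syntax, not confirmed against the 0.6.8 UI, so treat the exact spelling as illustrative.

```
You are a helpful assistant. Answer in plain prose, not JSON,
unless the user explicitly asks for structured output.

Use the following retrieved knowledge when relevant:
{{#context#}}

Conversation so far:
{{#histories#}}
```

Adding an explicit "answer in plain prose, not JSON" instruction is a prompt-level mitigation for the JSON-only responses reported above.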


AndyMik90 commented 4 months ago

I also suggest setting HarmBlockThreshold to "BLOCK_ONLY_HIGH", since most users will otherwise get: `google.api_core.exceptions.InvalidArgument: 400 User has requested a restricted HarmBlockThreshold setting BLOCK_NONE.`
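As a minimal sketch of that workaround, the snippet below assembles the `safetySettings` portion of a Vertex AI `generateContent` request body using only the standard library; the four category names are the standard Gemini harm categories, and the helper name `build_safety_settings` is illustrative, not part of any SDK.

```python
# Build the safetySettings block of a Vertex AI generateContent request,
# relaxing every harm category from the rejected BLOCK_NONE to
# BLOCK_ONLY_HIGH, as suggested above.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def build_safety_settings(threshold: str = "BLOCK_ONLY_HIGH") -> list[dict]:
    """Return one safety-setting entry per harm category."""
    return [{"category": c, "threshold": threshold} for c in HARM_CATEGORIES]


settings = build_safety_settings()
```

The same category/threshold pairs can equally be passed through the `vertexai` Python SDK's `SafetySetting` objects; the dict form above mirrors the REST payload.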

guchenhe commented 4 months ago

system_instruction is a valid parameter according to Google's latest API docs: https://cloud.google.com/python/docs/reference/aiplatform/latest/vertexai.generative_models.GenerativeModel. I suspect it's a server-side bug; a similar issue was raised against Vertex AI's Node.js SDK: https://github.com/googleapis/nodejs-vertexai/issues/350.

hellof20 commented 4 months ago

#4971, which removes the Vertex AI Gemini model's default System Instructions prompt, will fix this.