langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Issue with OpenAPI Object Parsing and Chaining Agents with Workflows #4665

Open agrigeorge opened 1 month ago

agrigeorge commented 1 month ago

Dify version

0.6.8

Cloud or Self Hosted

Cloud

Steps to reproduce

1. Define an OpenAPI specification for the tool with an inputs field declared as an object.
2. Add this specification as a tool to an agent in dify.ai.
3. Observe that the inputs field is interpreted as a string rather than an object.

✔️ Expected Behavior

The inputs field defined in the OpenAPI specification should be interpreted as an object, enabling the correct configuration and data exchange between tools in the workflow.

❌ Actual Behavior

The inputs field is incorrectly interpreted as a string, which disrupts the proper integration and functionality of the tool within the workflow.
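To make the difference concrete, here is a minimal sketch of the two request bodies as the receiving side would see them. It is an illustration only: the `build_body` and `inputs_kind` helpers and the `demo-user` value are hypothetical, and the second body is an assumption about what a stringified inputs field looks like on the wire.

```python
import json


def build_body(inputs, user="demo-user"):
    """Assemble a request body; `inputs` is passed through unchanged."""
    return {"inputs": inputs, "response_mode": "blocking", "user": user}


def inputs_kind(body_json: str) -> str:
    """Return the Python type name of `inputs` after decoding the JSON body."""
    decoded = json.loads(body_json)
    return type(decoded["inputs"]).__name__


# Expected: inputs is a JSON object.
as_object = json.dumps(build_body({"query": "hello"}))

# Assumed actual: inputs arrives as a JSON-encoded string.
as_string = json.dumps(build_body(json.dumps({"query": "hello"})))

print(inputs_kind(as_object))  # dict
print(inputs_kind(as_string))  # str
```

An endpoint that expects an object would have to decode the second form a second time before it could read `query`.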

Additional Context

I am aware that dify.ai does not currently support chaining multiple agents together or giving agents direct access to workflows, though this functionality appears to be planned. As a workaround, I integrate individual agents via an OpenAPI specification and treat them as tools within each agent.

To manage the integration, I wrote an OpenAPI specification and added it as a tool to each agent. The workflow configuration requires the inputs field to be an object, but dify.ai interprets these object fields as strings. This problem was reported in several earlier issues and was considered resolved, yet my implementation still runs into it.

Thank you for your time and help with this issue. Please let me know if there's anything I might be missing or any additional steps I should consider to resolve this problem. Your assistance is greatly appreciated!

openapi: "3.1.0"
info:
  title: "Agent Workflow API"
  version: "1.0.0"
  description: ""
servers:
  - url: "https://api.dify.ai/v1"
paths:
  /workflows/run:
    post:
      operationId: "processAgent"
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: "object"
              properties:
                inputs:
                  type: "object"
                  properties:
                    query:
                      type: "string"
                response_mode:
                  type: "string"
                  default: "blocking"
                conversation_id:
                  type: "string"
                user:
                  type: "string"
              required:
                - "response_mode"
                - "conversation_id"
                - "user"
                - "inputs"
      responses:
        200:
          description: "Successful operation"
          content:
            application/json:
              schema:
                type: "object"
                properties:
                  status:
                    type: "string"
                  data:
                    type: "object"
                    additionalProperties: true
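As a rough check of what the request schema above asks for, the sketch below validates a body against the required fields and the object type of inputs. It uses only the standard library and hand-codes the rules rather than parsing the YAML. The `check_request_body` helper and the sample values are hypothetical.

```python
# Required fields copied from the `required` list in the spec above.
REQUIRED = ("response_mode", "conversation_id", "user", "inputs")


def check_request_body(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means the body fits."""
    problems = [
        f"missing required field: {name}" for name in REQUIRED if name not in body
    ]
    if "inputs" in body:
        inputs = body["inputs"]
        if not isinstance(inputs, dict):
            problems.append(
                f"inputs must be an object, got {type(inputs).__name__}"
            )
        elif "query" in inputs and not isinstance(inputs["query"], str):
            problems.append("inputs.query must be a string")
    return problems


good = {
    "inputs": {"query": "hello"},
    "response_mode": "blocking",
    "conversation_id": "abc",
    "user": "demo-user",
}
# Same body, but with inputs flattened to a string as described in this issue.
bad = dict(good, inputs='{"query": "hello"}')

print(check_request_body(good))  # []
print(check_request_body(bad))   # ['inputs must be an object, got str']
```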

Related issues:
https://github.com/langgenius/dify/issues/3421
https://github.com/langgenius/dify/issues/3796
https://github.com/langgenius/dify/issues/2315

dosubot[bot] commented 1 month ago

Hey @agrigeorge! I'm here to help you out with any bugs, questions, or contributions you have for the project. Let's work together to tackle this issue!

To resolve the issue where the inputs field in your OpenAPI specification is being interpreted as a string instead of an object in Dify, ensure that the type of the inputs field is correctly specified in the OpenAPI schema. The ApiBasedToolSchemaParser class uses the _get_tool_parameter_type method to determine the type of parameters.

Here's a snippet from the code that shows how the type is determined:

@staticmethod
def _get_tool_parameter_type(parameter: dict) -> ToolParameter.ToolParameterType:
    parameter = parameter or {}
    typ = None
    if 'type' in parameter:
        typ = parameter['type']
    elif 'schema' in parameter and 'type' in parameter['schema']:
        typ = parameter['schema']['type']

    if typ == 'integer' or typ == 'number':
        return ToolParameter.ToolParameterType.NUMBER
    elif typ == 'boolean':
        return ToolParameter.ToolParameterType.BOOLEAN
    elif typ == 'string':
        return ToolParameter.ToolParameterType.STRING
    elif typ == 'object':
        return ToolParameter.ToolParameterType.OBJECT
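The branching quoted above can be exercised in isolation with a standalone sketch. The `ParamType` enum is a hypothetical stand-in for `ToolParameter.ToolParameterType`, and `get_parameter_type` is an illustrative copy of the logic, not Dify's actual code. Because the quoted snippet has no final `return`, an unrecognised type falls through and yields `None`, and the sketch keeps that behaviour.

```python
from enum import Enum
from typing import Optional


class ParamType(Enum):
    """Hypothetical stand-in for ToolParameter.ToolParameterType."""
    NUMBER = "number"
    BOOLEAN = "boolean"
    STRING = "string"
    OBJECT = "object"


def get_parameter_type(parameter: Optional[dict]) -> Optional[ParamType]:
    """Map an OpenAPI parameter or schema dict to a ParamType, or None if unknown."""
    parameter = parameter or {}
    typ = None
    # Prefer a top-level `type`; otherwise look inside a nested `schema`.
    if "type" in parameter:
        typ = parameter["type"]
    elif "schema" in parameter and "type" in parameter["schema"]:
        typ = parameter["schema"]["type"]

    if typ in ("integer", "number"):
        return ParamType.NUMBER
    if typ == "boolean":
        return ParamType.BOOLEAN
    if typ == "string":
        return ParamType.STRING
    if typ == "object":
        return ParamType.OBJECT
    return None  # e.g. "array", or no type information at all


print(get_parameter_type({"type": "object"}))               # ParamType.OBJECT
print(get_parameter_type({"schema": {"type": "integer"}}))  # ParamType.NUMBER
print(get_parameter_type({"type": "array"}))                # None
```

Note that this mapping only looks at the `type` of the dict it is given. It says nothing about how properties nested inside an object are walked.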

To ensure that the inputs field is interpreted as an object, define it in your OpenAPI specification with the type set to object. Here is an example:

components:
  schemas:
    YourSchema:
      type: object
      properties:
        inputs:
          type: object
          properties:
            query:
              type: string

By specifying the type as object for the inputs field, the parser will correctly interpret it as an object instead of a string [1].

Additionally, there is a known limitation in the current version (v0.6.8) of Dify: the ApiBasedToolSchemaParser class does not handle nested object types within the OpenAPI schema [1].
