openapi-json-schema-tools / openapi-json-schema-generator

OpenAPI JSON Schema Generator allows auto-generation of API client libraries with a focus on JSON schema given an OpenAPI document
Apache License 2.0

[REQ][python] Improve generated new input type for oneOf schema #445

Closed Marcelo00 closed 2 months ago

Marcelo00 commented 3 months ago

Description

When a schema that uses the anyOf keyword is referenced as the items of an array schema, the generated Python class does not have the correct typing.

Given the following schemas

    ClassificationToReturn:
      properties:
        name:
          type: string
          minLength: 3
        displayName:
          type: string
        exportName:
          type: string
          nullable: true
          minLength: 1
        attributes:
          items:
            $ref: "#/components/schemas/ClassificationAttribute"
          type: array
    ClassificationAttribute:
      $ref: "#/components/schemas/SingleValueClassificationAttribute"
    SingleValueClassificationAttribute:
      properties:
        name:
          type: string
          minLength: 1
          pattern: ^[^.]+$
        displayName:
          type: string
          minLength: 1
        exportName:
          type: string
          minLength: 1
      required:
        - name
      type: object
      additionalProperties: false
    MultiValueClassificationAttribute:
      properties:
        name:
          type: string
          minLength: 1
        maximumCount:
          type: integer
          format: int32
          minimum: 0
      required:
        - name
      type: object
      additionalProperties: false

the created class (in classification_to_return.py) has the correct typing information

class AttributesTuple(
    typing.Tuple[
        classification_attribute.single_value_classification_attribute.SingleValueClassificationAttributeDict,
        ...
    ]
):

    def __new__(cls, arg: typing.Union[AttributesTupleInput, AttributesTuple], configuration: typing.Optional[schema_configuration.SchemaConfiguration] = None):
        return Attributes.validate(arg, configuration=configuration)
AttributesTupleInput = typing.Union[
    typing.List[
        typing.Union[
            classification_attribute.single_value_classification_attribute.SingleValueClassificationAttributeDictInput,
            classification_attribute.single_value_classification_attribute.SingleValueClassificationAttributeDict,
        ],
    ],
    typing.Tuple[
        typing.Union[
            classification_attribute.single_value_classification_attribute.SingleValueClassificationAttributeDictInput,
            classification_attribute.single_value_classification_attribute.SingleValueClassificationAttributeDict,
        ],
        ...
    ]
]

@dataclasses.dataclass(frozen=True)
class Attributes(
    schemas.Schema[schemas.immutabledict, AttributesTuple]
):
    types: typing.FrozenSet[typing.Type] = frozenset({tuple})
    items: typing.Type[classification_attribute.ClassificationAttribute] = dataclasses.field(default_factory=lambda: classification_attribute.ClassificationAttribute) # type: ignore
    type_to_output_cls: typing.Mapping[
        typing.Type,
        typing.Type
    ] = dataclasses.field(
        default_factory=lambda: {
            tuple: AttributesTuple
        }
    )

However, when we use an anyOf in ClassificationAttribute

      ClassificationToReturn:
        properties:
          name:
            type: string
            minLength: 3
          displayName:
            type: string
          exportName:
            type: string
            nullable: true
            minLength: 1
          attributes:
            items:
              $ref: "#/components/schemas/ClassificationAttribute"
            type: array
      ClassificationAttribute:
        anyOf:
          - $ref: "#/components/schemas/SingleValueClassificationAttribute"
          - $ref: "#/components/schemas/MultiValueClassificationAttribute"
      SingleValueClassificationAttribute:
        properties:
          name:
            type: string
            minLength: 1
            pattern: ^[^.]+$
          displayName:
            type: string
            minLength: 1
          exportName:
            type: string
            minLength: 1
        required:
          - name
        type: object
        additionalProperties: false
      MultiValueClassificationAttribute:
        properties:
          name:
            type: string
            minLength: 1
          maximumCount:
            type: integer
            format: int32
            minimum: 0
        required:
          - name
        type: object
        additionalProperties: false

we will get the wrong typing information in classification_to_return.py

class AttributesTuple(
    typing.Tuple[
        schemas.OUTPUT_BASE_TYPES,
        ...
    ]
):

    def __new__(cls, arg: typing.Union[AttributesTupleInput, AttributesTuple], configuration: typing.Optional[schema_configuration.SchemaConfiguration] = None):
        return Attributes.validate(arg, configuration=configuration)
AttributesTupleInput = typing.Union[
    typing.List[
        typing.Union[
            schemas.INPUT_TYPES_ALL,
            schemas.OUTPUT_BASE_TYPES
        ],
    ],
    typing.Tuple[
        typing.Union[
            schemas.INPUT_TYPES_ALL,
            schemas.OUTPUT_BASE_TYPES
        ],
        ...
    ]
]

@dataclasses.dataclass(frozen=True)
class Attributes(
    schemas.Schema[schemas.immutabledict, AttributesTuple]
):
    types: typing.FrozenSet[typing.Type] = frozenset({tuple})
    items: typing.Type[classification_attribute.ClassificationAttribute] = dataclasses.field(default_factory=lambda: classification_attribute.ClassificationAttribute) # type: ignore
    type_to_output_cls: typing.Mapping[
        typing.Type,
        typing.Type
    ] = dataclasses.field(
        default_factory=lambda: {
            tuple: AttributesTuple
        }
    )

AttributesTuple should be typed with ClassificationAttribute, or at least with a union of both possible types.
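To make the request concrete, here is a minimal sketch of the typing being asked for. It is not generator output: the two dict classes are bare stand-ins for the generated classes of the same name, and the alias ClassificationAttributeOutput is a hypothetical name used only for illustration.

```python
import typing


class SingleValueClassificationAttributeDict(dict):
    """Stand-in for the generated output class of the same name."""


class MultiValueClassificationAttributeDict(dict):
    """Stand-in for the generated output class of the same name."""


# Hypothetical alias: the union of both anyOf branches.
ClassificationAttributeOutput = typing.Union[
    SingleValueClassificationAttributeDict,
    MultiValueClassificationAttributeDict,
]


class AttributesTuple(typing.Tuple[ClassificationAttributeOutput, ...]):
    """Tuple whose items are typed as one of the two attribute dicts."""


attributes = AttributesTuple(
    (
        SingleValueClassificationAttributeDict(name="color"),
        MultiValueClassificationAttributeDict(name="tags", maximumCount=2),
    )
)
```

With item types like these, a type checker could narrow each element with isinstance instead of treating it as schemas.OUTPUT_BASE_TYPES.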

openapi-json-schema-generator version

Version 4.5.1

Generation Details

Used the cli command to generate the client.

spacether commented 3 months ago

The code generation is functioning as expected here. The anyOf refs are not known as possible types. For them to be known, they would have to be examined one schema level higher and compared against the type info. Currently the generator only knows about input and output types for array and object schemas, for types declared in the type keyword, and for $ref. Because anyOf is not analyzed for type info, the schema is treated as having no type info at all, which allows any type.

If you want different functionality here, please consider filing a PR with thorough test cases.

spacether commented 3 months ago

Also, should your anyOf be a oneOf so it matches only one of those schemas rather than one or more?
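The difference between the two keywords can be illustrated with a small hand-written check. This is only a sketch: the two functions below are simplified stand-ins for the SingleValueClassificationAttribute and MultiValueClassificationAttribute schemas posted above (the int32 range is not checked), not the validation code the generator produces. anyOf accepts a payload that matches at least one branch, oneOf requires exactly one match.

```python
import re
from typing import Any, Mapping


def is_single_value(obj: Mapping[str, Any]) -> bool:
    """Simplified check mirroring SingleValueClassificationAttribute."""
    allowed = {"name", "displayName", "exportName"}
    if not set(obj) <= allowed or "name" not in obj:
        return False  # additionalProperties: false, name is required
    for key in obj:
        value = obj[key]
        if not isinstance(value, str) or len(value) < 1:
            return False  # every property is a string with minLength 1
    return re.search(r"^[^.]+$", obj["name"]) is not None


def is_multi_value(obj: Mapping[str, Any]) -> bool:
    """Simplified check mirroring MultiValueClassificationAttribute."""
    allowed = {"name", "maximumCount"}
    if not set(obj) <= allowed or "name" not in obj:
        return False
    name = obj["name"]
    if not isinstance(name, str) or len(name) < 1:
        return False
    count = obj.get("maximumCount", 0)
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(count, int) and not isinstance(count, bool) and count >= 0


BRANCHES = [is_single_value, is_multi_value]


def count_matches(obj: Mapping[str, Any]) -> int:
    return sum(1 for check in BRANCHES if check(obj))


def valid_any_of(obj: Mapping[str, Any]) -> bool:
    return count_matches(obj) >= 1


def valid_one_of(obj: Mapping[str, Any]) -> bool:
    return count_matches(obj) == 1
```

Note that with the schemas as posted, a payload containing only name, such as {"name": "color"}, satisfies both branches: valid_any_of returns True for it while valid_one_of returns False. Switching to oneOf would therefore reject such payloads unless the two schemas are made mutually exclusive.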

Marcelo00 commented 3 months ago

Thanks for the information. Would it be possible to have a short meeting, or could you point out the steps necessary to implement this? We would like to invest some time in adding such a feature, since our use case relies on these classes and proper type hints would be valuable. However, we cannot really judge how long it would take.

I would also assume that this behavior occurs with other keywords such as oneOf. Is that correct?

> Also, should your anyOf be a oneOf so it matches only one of those schemas rather than one or more?

You are right. Our openAPI.yaml is also auto-generated, which sometimes produces semantically wrong specifications like this one.

spacether commented 3 months ago

Adding this is non-trivial. In the latest version of the generators, returned instances only subclass one class. So to do this one would need to:

Doing this is difficult because any number of keywords can exist in the same schema including:

So a new CodegenSchema field that could store this may look like typeToInterfaces: Map<String, List>. Interfaces can come from refed locations, so one would need to store the ref location and the name of the generated interface class. Some examples:

    typeToInterfaces = {"string": null, "boolean": null}  # for a schema that allows strings and booleans
    typeToInterfaces = {"object": [{SingleValueClassificationAttributeInterface, MultiValueClassificationAttributeInterface}]}  # for your use case with oneOf two object types

This property would need to be added to CodegenSchema.
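As a rough illustration of how such a mapping could be derived, here is a Python sketch that walks the oneOf branches of a schema given as a plain dict. Everything in it is hypothetical: the function names, the rule of appending "Interface" to the ref's last path segment, and the decision to store only the interface name rather than the ref location as well. The actual field would live in the generator's CodegenSchema, which this sketch does not touch.

```python
from typing import Any, Dict, List, Mapping, Optional


def interface_name(ref: str) -> str:
    """Hypothetical naming rule: last path segment of the ref plus 'Interface'."""
    return ref.rsplit("/", 1)[-1] + "Interface"


def type_to_interfaces(
    schema: Mapping[str, Any],
    components: Mapping[str, Mapping[str, Any]],
) -> Dict[str, Optional[List[str]]]:
    """Map each JSON type allowed by the oneOf branches to its interface names.

    Inline branches contribute a type with no interface (None); refed
    branches contribute the interface name derived from the ref.
    """
    result: Dict[str, Optional[List[str]]] = {}
    for branch in schema.get("oneOf", []):
        ref = branch.get("$ref")
        # Resolve a ref against the components mapping, else use the branch itself
        target = components[ref.rsplit("/", 1)[-1]] if ref else branch
        json_type = target.get("type")
        if json_type is None:
            continue  # no type info on this branch; nothing to record
        if ref is None:
            result.setdefault(json_type, None)
        else:
            existing = result.get(json_type) or []
            result[json_type] = existing + [interface_name(ref)]
    return result


components = {
    "SingleValueClassificationAttribute": {"type": "object"},
    "MultiValueClassificationAttribute": {"type": "object"},
}
classification_attribute = {
    "oneOf": [
        {"$ref": "#/components/schemas/SingleValueClassificationAttribute"},
        {"$ref": "#/components/schemas/MultiValueClassificationAttribute"},
    ]
}
mapping = type_to_interfaces(classification_attribute, components)
```

For the schema in this issue, mapping has a single "object" key listing both interface names; a schema whose branches are inline primitive types gets one key per type with a None value.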

When are you free this week or next week to discuss this?

Marcelo00 commented 3 months ago

Hey, I am free this week. How should we proceed? I can send you an email to settle on a date. A coworker might also join the meeting.

spacether commented 3 months ago

A simpler solution here is to handle only the specific case where a schema has a oneOf definition and no other JSON Schema keywords. In this case, you could modify the Schema.validate method to do the following:

Another solution here is to use Speakeasy's code generation: https://github.com/speakeasy-api/speakeasy which I believe handles oneOfs in the way that you want.

spacether commented 2 months ago

@Marcelo00 how did you decide to proceed here? It has been about one month since we met about this.

spacether commented 2 months ago

If I don't hear back here I will close this issue as inactive.

Marcelo00 commented 2 months ago

Hey, I am sorry for not getting back to you. We tried out your suggestion of using Speakeasy, and it turns out that it fulfills our requirements. Thanks again for the time you invested in this and for your suggestion. I really appreciate your help.