koxudaxi / datamodel-code-generator

Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
https://koxudaxi.github.io/datamodel-code-generator/
MIT License

Alias mapping creates Models and Fields with same name #1626

Open YYYasin19 opened 11 months ago

YYYasin19 commented 11 months ago

Describe the bug
I used datamodel-code-generator with a custom alias mapping to create a Pydantic v2 model. Some generated fields ended up with the same name as other models, which breaks validation for Optional fields. I believe the bug (?) is independent of specific settings and present in all cases where one might use optional fields and a mapping.

To Reproduce
The JSON Schema looks roughly like this:

{
  "title": "title1",
  "type": "object",
  "properties": {
    "F00000001173": {
      "title": "title2",
      "type": "string",
      "enum": ["001", "002", "003"]
    },
    "G00000001490": {
      "title": "title3",
      "type": "object",
      "properties": {
        "G00000001477": {
          "title": "title4",
          "type": "object",
          "properties": {},
          "required": ["F60000000227", "G60000000083", "G60000000093"]
        },
        "F00000002548": {
          "title": "title5",
          "type": "number",
          "multipleOf": 0.01,
          "minimum": 0
        },
        "F60000000296": {
          "type": "array",
          "items": {
            "title": "title6",
            "type": "string"
          },
          "minItems": 0
        }
      },
      "required": ["G00000001477", "F00000002548", "F60000000296"]
    }
  },
  "required": ["F00000001173"]
}

and a mapping between the ID fields and some more descriptive names was used (to make them more readable).

{
  "F00000002548": "field1",
  "G00000001490": "Model2",
  "F00000001173": "field3",
  "F60000000296": "field4"
}

This resulted in code that looks like this:

class Model2(BaseModel):
    model_config = ConfigDict(
        extra="allow",
        populate_by_name=True,
    )
    field1: Annotated[float, Field(alias="F00000002548", ge=0.0, multiple_of=0.01)]
    field4: Annotated[List[str], Field(alias="F60000000296", min_length=0)]

class Model1(BaseModel):
    model_config = ConfigDict(
        extra="allow",
        populate_by_name=True,
    )
    field3: Annotated[Literal["001", "002", "003"], Field(alias="F00000001173")]
    Model2: Annotated[Optional[Model2], Field(alias="G00000001490")] = None

The problem is that the field Model2 has the same name as the class Model2; the field should be named differently.
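The collision can be reproduced without Pydantic. The following is a minimal sketch, assuming plain classes stand in for the generated models (Model1Fixed and the lowercase field name model2 are illustrative, not generator output). In a class body, the default value None is bound to the name Model2 before the annotation is evaluated, so Optional[Model2] resolves to Optional[None], which is just NoneType.

```python
from typing import Optional


class Model2:
    """Stand-in for the generated nested model."""


class Model1:
    # The field name shadows the class name: inside this class body,
    # `Model2` is already bound to None when the annotation is evaluated.
    Model2: Optional[Model2] = None


class Model1Fixed:
    # Hypothetical fix: a field name that differs from the class name.
    model2: Optional[Model2] = None


broken = Model1.__annotations__["Model2"]
fixed = Model1Fixed.__annotations__["model2"]

print(broken)  # the annotation has collapsed to NoneType
print(fixed)   # Optional[Model2], as intended
```

A validator that reads these annotations would see a field that only accepts None in the first class, which matches the validation failures reported for Optional fields.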

Used commandline:

$ datamodel-codegen --input schema.json --target-python-version 3.10 --use-annotated --field-constraints --enum-field-as-literal all --allow-population-by-field-name --output-model-type pydantic_v2.BaseModel --output schema.py

Expected behavior
Field names should not be the same as the names of other models. I achieved this temporarily by modifying the .jinja2 template to lowercase the field names, like this:

{%- for field in fields -%}
    {%- if not field.annotated and field.field %}
    {{ field.name | lower }}: {{ field.type_hint }} = {{ field.field }}
    {%- else %}
    {%- if field.annotated %}
    {{ field.name | lower }}: {{ field.annotated }}
    {%- else %}
    {{ field.name | lower }}: {{ field.type_hint }}
    {%- endif %}
    {%- if not field.required or field.data_type.is_optional or field.nullable
            %} = {{ field.represented_default }}
    {%- endif -%}
    {%- endif %}
    {%- if field.docstring %}
    """
    {{ field.docstring | indent(4) }}
    """
    {%- endif %}
{%- endfor -%}

Version:

Additional context
Is there a way to modify this consistently? Passing the custom_template_dir argument didn't seem to work for me. If this is indeed a bug, I'm happy to provide a PR if someone can tell me where the field names are created! :)

koxudaxi commented 10 months ago

@YYYasin19 I'm sorry for my late reply.

and a mapping between the ID fields and some more descriptive names was used (to make them more readable).

How do you use the mapping data when generating a model? I can't see the logic to inject it.

YYYasin19 commented 10 months ago

Hi, I just pass the --aliases mapping.json option to the CLI, or a dict to the Python interface.

YYYasin19 commented 10 months ago

I'm not sure if I'm using this incorrectly, since the bug should be very general, i.e. everyone using an alias mapping should be affected by it, right?

nwithan8 commented 5 months ago

I am affected by this. datamodel-code-generator is generating Models with the same name and case as field names, which causes Pydantic validation issues when their type constraints are Optional[T] = None. I am not using any annotations or aliases.

cpnat commented 1 month ago

I have the same issue as described by @nwithan8 above, although I haven't looked into the code in depth.

It is essentially an issue with name collisions between field and type annotation, as described here: https://docs.pydantic.dev/2.8/errors/validation_errors/#none_required

JasP19 commented 3 days ago

Hi @koxudaxi. We're also experiencing the same issue as @nwithan8. In our OpenAPI spec, if a nested object is named with a capital letter, e.g. Foo, then the generator creates a class called Foo and references it in the base model as Foo: Optional[Foo] = None. This triggers the Pydantic "Input should be None" error mentioned previously and linked in the comment by @cpnat.

Is there any simple way to solve or work around this? We can't exactly change the naming of the fields in the OpenAPI spec. One possibility I can think of would be a way to add a custom prefix/suffix to all generated classes, to avoid collisions with the field names.
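One possible workaround that leaves the spec untouched is to rename the fields rather than the classes, using the --aliases option mentioned earlier in this thread. The sketch below is an assumption, not a confirmed fix: it builds a mapping from every capitalised property name in a schema to a snake_case name, in the same spirit as the lower filter in the template above. The helpers snake_case and build_aliases are hypothetical names, and whether the resulting mapping avoids the collision in a given generator version is untested here.

```python
import json
import re


def snake_case(name: str) -> str:
    """Convert a name such as 'FooBar' to 'foo_bar'."""
    # Insert an underscore before a capital that follows a lowercase
    # letter or digit, then lowercase the whole name.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def build_aliases(schema: dict) -> dict:
    """Collect {original_name: snake_case_name} for capitalised properties."""
    aliases = {}

    def walk(node):
        if isinstance(node, dict):
            props = node.get("properties")
            if isinstance(props, dict):
                for prop_name in props:
                    if prop_name[:1].isupper():
                        aliases[prop_name] = snake_case(prop_name)
            # Recurse into every value to reach nested schemas.
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(schema)
    return aliases


# Illustrative schema, not taken from a real spec.
example_schema = {
    "type": "object",
    "properties": {
        "Foo": {"type": "object", "properties": {"BarBaz": {"type": "string"}}},
        "name": {"type": "string"},
    },
}

example_aliases = build_aliases(example_schema)
print(json.dumps(example_aliases, indent=2))
```

The printed JSON could be saved to a file and passed as --aliases mapping.json, so that a field such as Foo is generated as foo with alias "Foo" and no longer shares its name with the class.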