koxudaxi / datamodel-code-generator

Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
https://koxudaxi.github.io/datamodel-code-generator/
MIT License
2.65k stars 296 forks source link

PydanticV2 allows extra fields _without_ `--allow-extra-fields` option #1977

Open maximilian-tech opened 4 months ago

maximilian-tech commented 4 months ago

Describe the bug It should forbid extra_fields by default. PyDantic_v2 itself ignores extra_fields per default. So I expect datamodel-code-gen to explicitly set allow_extra_fields=False unless --allow-extra-fields is specified

To Reproduce

Example schema:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "age": {
      "type": "integer",
      "minimum": 0
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "address": {
      "type": "object",
      "properties": {
        "street": {
          "type": "string"
        },
        "city": {
          "type": "string"
        },
        "zipcode": {
          "type": "string",
          "pattern": "^[0-9]{5}(?:-[0-9]{4})?$"
        }
      },
      "required": ["street", "city", "zipcode"]
    },
    "phoneNumbers": {
      "type": "array",
      "items": {
        "type": "string",
        "pattern": "^[0-9]{10}$"
      }
    }
  },
  "required": ["name", "age", "email", "address"]
}

Used commandline:

$ datamodel-codegen --input test_json.json --output test_json.py --output-model-type pydantic_v2.BaseModel

Expected behavior I want the following behavior by default

# generated by datamodel-codegen:
#   filename:  test_json.json
#   timestamp: 2024-05-29T06:01:45+00:00

from __future__ import annotations

from typing import List, Optional

from pydantic import BaseModel, ConfigDict, EmailStr, conint, constr

class Address(BaseModel):
    model_config = ConfigDict(
        extra='forbid',
    )
    street: str
    city: str
    zipcode: constr(pattern=r'^[0-9]{5}(?:-[0-9]{4})?$')

class Model(BaseModel):
    model_config = ConfigDict(
        extra='forbid',
    )
    name: str
    age: conint(ge=0)
    email: EmailStr
    address: Address
    phoneNumbers: Optional[List[constr(pattern=r'^[0-9]{10}$')]] = None

What I got is this:

# generated by datamodel-codegen:
#   filename:  test_json.json
#   timestamp: 2024-05-29T06:01:45+00:00

from __future__ import annotations

from typing import List, Optional

from pydantic import BaseModel, ConfigDict, EmailStr, conint, constr

class Address(BaseModel):
    street: str
    city: str
    zipcode: constr(pattern=r'^[0-9]{5}(?:-[0-9]{4})?$')

class Model(BaseModel):
    name: str
    age: conint(ge=0)
    email: EmailStr
    address: Address
    phoneNumbers: Optional[List[constr(pattern=r'^[0-9]{10}$')]] = None

Version:

Additional context Add any other context about the problem here.

fkapsahili commented 2 months ago

I encountered the same issue, but found out that setting the additionalProperties property to false in the JSON schema definitions results in a generated model with the extra attribute in the model_config set to "forbid":

model_config = ConfigDict(
    extra='forbid',
)
Emerentius commented 2 weeks ago

Both the JSON schema and openAPI definitions are extensible by default. See the openAPI spec:

additionalProperties - Value can be boolean or object. Inline or referenced schema MUST be of a Schema Object and not a standard JSON Schema. Consistent with JSON Schema, additionalProperties defaults to true.

Therefore datamodel-codegen's only viable options are to ignore or allow extra fields and it goes with the default of ignoring them. I'd rather see it default to "allow", but defaulting to "forbid" would just be wrong.

maximilian-tech commented 2 weeks ago

The ambiguity arises from the different layers, the JSON passes through when being processed here. Per default it does nothing/let's underlying layers decide what is what. In the case of pedanticV1 it once was the default to forbid extra fields, hence an --allow-extra-fields option was added. But PydanticV2 allows extra fields per default, now one would need a --forbid-extra-fields option. This plays a little bit the parsing vs. validating difference. But for me it would be ok, if the documentation would describe the implementation, and an additional flag would be introduced - keeping the status quo of not interfering with 'default' values.