aws-powertools / powertools-lambda-python

A developer toolkit to implement Serverless best practices and increase developer velocity.
https://docs.powertools.aws.dev/lambda/python/latest/
MIT No Attribution

Cannot use models generated by `datamodel-codegen` from OpenAPI 3.0 spec with property-level `example:`s #4476

Open nlykkei opened 2 weeks ago

nlykkei commented 2 weeks ago

Expected Behaviour

I would like to use models generated by datamodel-codegen from an OpenAPI 3.0 spec that contains property-level example: entries.

Current Behaviour

The following components.schemas.LddProjectsList:

LddProjectsList:
  title: LddProjectsList
  type: object
  properties:
    pagination:
      type: object
      properties:
        next:
          type: string
          description: The next page uri
          format: uri
          example: http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg==
        current:
          type: integer
          description: The current page number
        total:
          description: The total number of pages
          type: integer
      required:
        - next
        - current
        - total
    projects:
      type: array
      items:
        $ref: "#/components/schemas/LddProject"
    cursor:
      type: string
  required:
    - pagination
    - projects

Results in the following model being output by datamodel-codegen:

class Pagination(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    next: AnyUrl = Field(..., examples=["http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=="])
    """
    The next page uri
    """
    current: int
    """
    The current page number
    """
    total: int
    """
    The total number of pages
    """

When the Pagination model is used with the AWS Lambda Powertools REST API Event Handler as part of Response[T], generating the OpenAPI 3.0 spec fails with the following error, because each entry in examples is expected to be either a valid dict or an Example instance:

❯ poetry run python src/lego/bff/application/lambda_handlers/api_gateway_proxy_handler.py
/Users/dknilyiv/projects/lddprobff/lego/.venv/lib/python3.12/site-packages/pydantic/_internal/_fields.py:160: UserWarning: Field "model_bags" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/Users/dknilyiv/projects/lddprobff/lego/.venv/lib/python3.12/site-packages/aws_lambda_powertools/event_handler/api_gateway.py:1550: UserWarning: You are using Pydantic v2, which is incompatible with OpenAPI schema 3.0. Forcing OpenAPI 3.1
  openapi_version = self._determine_openapi_version(openapi_version)
Traceback (most recent call last):
  File "/Users/dknilyiv/projects/lddprobff/lego/src/lego/bff/application/lambda_handlers/api_gateway_proxy_handler.py", line 874, in <module>
    app.get_openapi_json_schema(
  File "/Users/dknilyiv/projects/lddprobff/lego/.venv/lib/python3.12/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 1717, in get_openapi_json_schema
    self.get_openapi_schema(
  File "/Users/dknilyiv/projects/lddprobff/lego/.venv/lib/python3.12/site-packages/aws_lambda_powertools/event_handler/api_gateway.py", line 1617, in get_openapi_schema
    return OpenAPI(**output)
           ^^^^^^^^^^^^^^^^^
  File "/Users/dknilyiv/projects/lddprobff/lego/.venv/lib/python3.12/site-packages/pydantic/main.py", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 3 validation errors for OpenAPI
components.schemas.Pagination.Schema.properties.next.Schema.examples.0
  Input should be a valid dictionary or instance of Example [type=model_type, input_value='http://www.example.com/s...W4gb3BhcXVlIGN1cnNvcg==', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
components.schemas.Pagination.Schema.properties.next.bool
  Input should be a valid boolean [type=bool_type, input_value={'examples': ['http://www...Next', 'type': 'string'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/bool_type
components.schemas.Pagination.Reference.$ref
  Field required [type=missing, input_value={'properties': {'next': {...tion', 'type': 'object'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
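The first validation error points at examples.0 receiving a plain str. As a minimal sketch, assuming Pydantic v2 is installed, the following shows how the generated model ends up emitting such an entry in its JSON schema. The EXAMPLE_URL constant is introduced here only for illustration, and Powertools itself is not involved in this snippet.

```python
from pydantic import AnyUrl, BaseModel, Field

# Illustrative constant (not part of the original report's code).
EXAMPLE_URL = "http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=="


class Pagination(BaseModel):
    # Same shape as the model generated by datamodel-codegen.
    next: AnyUrl = Field(..., examples=[EXAMPLE_URL])
    current: int
    total: int


# Pydantic v2 copies `examples` into the JSON schema as a list of raw values.
next_schema = Pagination.model_json_schema()["properties"]["next"]
print(next_schema["examples"])
```

The list holds bare strings rather than dicts, which matches the input_type=str reported in the traceback.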

Code snippet

Possible Solution

No response

Steps to Reproduce

See current behavior

Powertools for AWS Lambda (Python) version

latest

AWS Lambda function runtime

3.12

Packaging format used

PyPI

Debugging logs

No response

heitorlessa commented 2 weeks ago

hey @nlykkei , thank you for reporting it - got any code snippet to make it easier to reproduce it?

Edit: brew latest upgrade broke openssl on my machine so I'll take longer to debug.

heitorlessa commented 2 weeks ago

@leandrodamascena are you able to look into it tomorrow? had a rough day today and w/ development setup suddenly broken it was hard to concentrate. It almost looks like either input or a type we can't deserialize (AnyUrl?).

leandrodamascena commented 2 weeks ago

Hey @nlykkei! Thanks for reporting this situation.

In your components.schemas.LddProjectsList, you define a single example with the example keyword: example: http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg==. In your model, however, the field uses examples, which takes a list of entries. To achieve what you're looking for, your code should be something like this:

from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from pydantic import AnyUrl, BaseModel, ConfigDict, Field

app = APIGatewayRestResolver(enable_validation=True)

class Pagination(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    next: AnyUrl = Field(..., example="http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg==")
    """
    The next page uri
    """
    current: int
    """
    The current page number
    """
    total: int
    """
    The total number of pages
    """

@app.get("/hello")
def hello() -> Pagination:
    ...

def lambda_handler(event, context):
    print(app.get_openapi_json_schema())
    return app.resolve(event, context)

Perhaps you do want to use the examples field at the component level. However, you mentioned that you're using OpenAPI 3.0, and while it is possible to define examples in OpenAPI 3.0, you may have trouble validating them in Swagger, because Swagger only accepts examples at the component level from version 3.1.0 onward.

Please let me know if you have any additional questions.

Thanks

nlykkei commented 2 weeks ago

@leandrodamascena Thank you for such a swift response!

However, while Pydantic's datamodel-codegen generates invalid models from its input OpenAPI 3.0.x spec, that shouldn't be an issue for AWS Lambda Powertools?

After all, AWS Lambda Powertools generates OpenAPI 3.1.x, so the examples array shouldn't be an issue?

class Pagination(BaseModel):
    next: AnyUrl = Field(..., description='The next page uri', examples=['http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=='])
    current: int = Field(..., description='The current page number')
    total: int = Field(..., description='The total number of pages')

Consider the following example:

OpenAPI 3.0.x:

openapi: 3.0.3
info:
  title: Example API
  version: latest
servers:
  - url: https://example.com

paths:
  /swagger:
    get:
      operationId: getSwagger
      description: Returns Swagger UI
      responses:
        "200":
          description: Successful response
          content:
            text/html: {}
      security:
        - api_key: []

components:
  schemas:
    LddProjectsList:
      title: LddProjectsList
      type: object
      properties:
        pagination:
          type: object
          properties:
            next:
              type: string
              description: The next page uri
              format: uri
              example: http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg==
            current:
              type: integer
              description: The current page number
            total:
              description: The total number of pages
              type: integer
          required:
            - next
            - current
            - total
        cursor:
          type: string
      required:
        - pagination

  securitySchemes:
    api_key:
      type: apiKey
      name: x-api-key
      in: header

When I generate models from this using:

poetry run datamodel-codegen --target-python-version 3.12 --input-file-type openapi --output-model-type pydantic_v2.BaseModel --input <input> --output <output>

The following models are output:

# generated by datamodel-codegen:
#   filename:  lddbff-openapi1.yaml
#   timestamp: 2024-06-11T14:09:40+00:00

from __future__ import annotations

from typing import Optional

from pydantic import AnyUrl, BaseModel, Field

class Pagination(BaseModel):
    next: AnyUrl = Field(..., description='The next page uri', examples=['http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=='])
    current: int = Field(..., description='The current page number')
    total: int = Field(..., description='The total number of pages')

class LddProjectsList(BaseModel):
    pagination: Pagination
    cursor: Optional[str] = None

Which results in invalid OpenAPI 3.0.x models, but valid OpenAPI 3.1.x models?

Indeed, if I change examples=['http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=='] to example='http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg==', then app.get_openapi_json_schema(...) generates the following, which is valid OpenAPI 3.0.x:

"Pagination": {
  "properties": {
    "next": {
      "type": "string",
      "minLength": 1,
      "format": "uri",
      "title": "Next",
      "example": "http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=="
    },
    "current": {
      "type": "integer",
      "title": "Current"
    },
    "total": {
      "type": "integer",
      "title": "Total"
    }
  },
  "type": "object",
  "required": [
    "next",
    "current",
    "total"
  ],
  "title": "Pagination"
}

But as we're using Pydantic 2.x, and AWS Lambda Powertools therefore enforces OpenAPI 3.1.x, it should have accepted the OpenAPI 3.1.x model with examples=['http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=='] to begin with?
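The difference between the two spellings discussed above (a single example in OpenAPI 3.0.x, an examples list in OpenAPI 3.1.x) can be illustrated with a small post-processing step. This is only a sketch under stated assumptions: downgrade_examples is a hypothetical helper, not part of Powertools, Pydantic, or datamodel-codegen, and it keeps only the first entry of each list.

```python
from typing import Any


def downgrade_examples(node: Any) -> Any:
    """Return a copy of a schema dict in which every `examples` list is
    replaced by a single `example` holding the first entry.

    Hypothetical helper for illustration only.
    """
    if isinstance(node, list):
        return [downgrade_examples(item) for item in node]
    if not isinstance(node, dict):
        return node

    converted = {}
    for key, value in node.items():
        if key == "examples" and isinstance(value, list):
            # Keep the first entry unless an explicit `example` already exists.
            if value and "example" not in node:
                converted["example"] = value[0]
            continue
        # A property that happens to be named "examples" has a dict value,
        # so it is recursed into like any other key.
        converted[key] = downgrade_examples(value)
    return converted


schema_31 = {
    "properties": {
        "next": {
            "type": "string",
            "format": "uri",
            "examples": ["http://www.example.com/a", "http://www.example.com/b"],
        },
        "current": {"type": "integer"},
    },
    "required": ["next", "current"],
}
schema_30 = downgrade_examples(schema_31)
print(schema_30["properties"]["next"])
```

The helper returns a new structure and leaves its input untouched. It walks every nested dict, so an example value that itself contains an examples list would also be rewritten; a real converter would need to be more careful.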

leandrodamascena commented 2 weeks ago

Hi @nlykkei, thanks for such a detailed response. Just to avoid back and forth and to have a clear understanding based on your last message, I assume you are now using Pydantic v2 + OpenAPI 3.1.x and not OpenAPI 3.0, and that I misunderstood this before, right?

If yes, I agree with you that we can fix this and improve the experience to allow defining examples with just a simple list of strings or a complex Example type. Currently, we enforce the use of a List of Example objects, but looking at the OpenAPI documentation, at the Schema level the examples parameter accepts a List[str | Example | Reference].
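The change described above can be sketched with plain Pydantic v2 models. The classes below (Example, StrictSchema, RelaxedSchema) and the validates helper are simplified, hypothetical stand-ins written for this illustration; they are not the actual Powertools OpenAPI models, and the real fix may look different.

```python
from typing import Any, List, Optional, Union

from pydantic import BaseModel, ValidationError


class Example(BaseModel):
    # Simplified stand-in for an OpenAPI Example object.
    summary: Optional[str] = None
    value: Optional[Any] = None


class StrictSchema(BaseModel):
    # Mirrors the current behaviour: only Example objects (or dicts) allowed.
    type: Optional[str] = None
    examples: Optional[List[Example]] = None


class RelaxedSchema(BaseModel):
    # Mirrors the proposed behaviour: plain strings are accepted as well.
    type: Optional[str] = None
    examples: Optional[List[Union[str, Example]]] = None


def validates(model: type, data: dict) -> bool:
    """Return True if `data` passes validation against `model`."""
    try:
        model.model_validate(data)
        return True
    except ValidationError:
        return False


payload = {"type": "string", "examples": ["http://www.example.com/some/path"]}
strict_ok = validates(StrictSchema, payload)
relaxed_ok = validates(RelaxedSchema, payload)
print(strict_ok, relaxed_ok)
```

Under these assumptions the strict model rejects the list of strings with the same kind of model_type error shown in the traceback, while the relaxed model accepts both strings and Example-shaped dicts.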

I made some local tests here and could generate the OpenAPI spec with this change.

app.py

from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from pydantic import AnyUrl, BaseModel, ConfigDict, Field

app = APIGatewayRestResolver(enable_validation=True)

class Pagination(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )
    next: AnyUrl = Field(..., examples=["http://www.example.com/some/path?cursor=SW0gYW4gb3BhcXVlIGN1cnNvcg=="])
    """
    The next page uri
    """
    current: int
    """
    The current page number
    """
    total: int
    """
    The total number of pages
    """

@app.get("/hello")
def hello() -> Pagination:
    ...

def lambda_handler(event, context):
    print(app.get_openapi_json_schema())
    return app.resolve(event, context)

Want to submit a PR to fix this? We are planning to release a new version next Thursday (13/06) and may include this fix. Please let me know if you can't, so I can send the PR to fix this.

Thanks.

nlykkei commented 2 weeks ago

Hi @nlykkei, thanks for such a detailed response. Just to avoid back and forth and to have a clear understanding based on your last message, I assume you are now using Pydantic v2 + OpenAPI 3.1.x and not OpenAPI 3.0, and that I misunderstood this before, right?

We're using Pydantic 2, which makes AWS Lambda Powertools generate OpenAPI 3.1.x for the Swagger UI; this happens automatically via app.enable_swagger(...).

As AWS Lambda Powertools doesn't support OpenAPI 3.0.x or extended attributes (X-), we're forced to also maintain an OpenAPI 3.0.x spec by hand. Over time, we hope that API Gateway would start supporting OpenAPI 3.1.x, and you would implement X- extended attributes for specifying Lambda proxy integrations.

In any case, virtually all Python-based LEGO teams use datamodel-codegen to generate models given an OpenAPI 3.0.x spec, hosted in our centralized API platform. If the generated models cannot be used with AWS Lambda Powertools without manual modifications, we have a problem :)

but looking at the OpenAPI documentation, at the Schema level the examples parameter accepts a List[str | Example | Reference].

Would you provide a link? :)

leandrodamascena commented 1 week ago

We will update this issue and next steps later today.

leandrodamascena commented 1 week ago

Rescheduling this for tomorrow due to some unexpected internal demands.