tiangolo / fastapi

FastAPI framework, high performance, easy to learn, fast to code, ready for production
https://fastapi.tiangolo.com/
MIT License
72.12k stars 6.06k forks source link

[BUG] Using Nested Pydantic models and `params: MyModel = Depends()` forces OpenAPI docs GET methods to require a request body. #11037

Open TrevorBenson opened 4 months ago

TrevorBenson commented 4 months ago

First check

Example

Here's a self-contained minimal, reproducible, example with my use case:

from fastapi import APIRouter, Depends, FastAPI, Query
from fastapi.responses import JSONResponse
from typing import List, Optional
from pydantic import BaseModel, Field
import uvicorn

class MyModelData1(BaseModel):
    archive: Optional[str] = Field(None, description="Archive name")
    archive_type: Optional[str] = Field(None, description="Archive type")

class MetadataGet(BaseModel):
    id: Optional[str] = Field(None, alias="_id")
    foreign_key: Optional[str] = Field(None, alias="_foreign_key")

class DeepNestedModelGet(BaseModel):
    name: Optional[str]
    version: Optional[str]

class DetailsModelGet(BaseModel):
    some_data: Optional[List[Optional[DeepNestedModelGet]]]
    some_data2: Optional[List[Optional[str]]]

class MyModelData2(BaseModel):
    id: Optional[str] = Field(None, alias="_id")
    details: Optional[DetailsModelGet]
    meta: Optional[MetadataGet]

def get_documents(collection, sort_by, sort_order, page, page_size, **additional_filters):
    return {
        "collection": collection,
    }

app = FastAPI()
router1 = APIRouter(prefix="/data1", tags=["Data1"])
router2 = APIRouter(prefix="/data2", tags=["Data2"])

@router1.get("/", description="Retrieve all documents.")
def get_data1(
    sort_by: str = Query(None, description="Sort by this field"),
    sort_order: str = Query(None, description="Sort order"),
    page: int = Query(1, description="Page number"),
    page_size: int = Query(100, description="Number of documents per page"),
    params: MyModelData1 = Depends(),
):
    document = get_documents(
        "data1",
        sort_by,
        sort_order,
        page,
        page_size,
        **params.model_dump()
    )
    return JSONResponse(content=document)

@router2.get("/")
def get_data2(
    sort_by: str = Query(None, description="Sort by this field"),
    sort_order: str = Query(None, description="Sort order"),
    page: int = Query(1, description="Page number"),
    page_size: int = Query(100, description="Number of documents per page"),
    params: MyModelData2 = Depends(),
):
    """Get all software."""
    document = get_documents(
        "data2",
        sort_by,
        sort_order,
        page,
        page_size,
        **params.model_dump(),
    )
    return JSONResponse(content=document)

app.include_router(router1)
app.include_router(router2)

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=5000)

Description

The solution you would like

The request body is not a required part of the GET /data2/ UI documentation when using an identical Depends() structure so dump_model() method can be used to pass very large models to the documentation as query parameters for the GET method which are optional.

Environment

Additional context

I've gone so far as to take List[str] which became Optional[List[str] in the Get version of the model, to an extreme of Optional[List[Optional[str]]] and List[DetailsModelGet] which was already all optional fields into Optional[List[Optional[DetailsModelGet]]]. Pushing Optional down to the last type and making "deeply optional" fields in case somehow this would resolve the issue. I cannot seem to find any instance of how to get the nested models not to result in a required request body in the docs GET method except to remove the nested (optional) models entirely and use a single-layer Pydantic model.

Originally posted by @stevesuh in https://github.com/tiangolo/fastapi/discussions/7275

juampivallejo commented 3 months ago

I faced a similar issue, the only way I found to make it work in the way you described was to explicitly pass a None default for the Optional nested model. Like this:

class MyModelData1(BaseModel):
    archive: Optional[str] = Field(None, description="Archive name")
    archive_type: Optional[str] = Field(None, description="Archive type")

class MetadataGet(BaseModel):
    id: Optional[str] = Field(None, alias="_id")
    foreign_key: Optional[str] = Field(None, alias="_foreign_key")

class DeepNestedModelGet(BaseModel):
    name: Optional[str] = None
    version: Optional[str] = None

class DetailsModelGet(BaseModel):
    some_data: Optional[List[DeepNestedModelGet]] = None
    some_data2: Optional[List[Optional[str]]] = None

class MyModelData2(BaseModel):
    id: Optional[str] = Field(None, alias="_id")
    details: DetailsModelGet = Depends()
    meta: MetadataGet = Depends()

In this case, the request body is not required for the GET call.

TrevorBenson commented 3 months ago

I stumbled onto this (or something very similar) the night I first posted this. However, I didn't update the issue because

  1. While the body was no longer required, the docs then contained 2 Required fields, args and kwargs. Screenshot from 2024-02-07 13-15-00
  2. The Try it out option seems to be broken, as clicking Execute does not return a TypeError, however no response is received and a LOADING is constantly spinning, even after clicking the cancel button to try and back out. Screenshot from 2024-02-07 13-16-25

The args and kwargs instead of actual params for the nested models (preferred) would be acceptable if not required or breaking the functionality for testing the API. I would prefer some form of the dump_model() defining the params as optional since that is how they are listed. I suppose not documenting nested models as queryable params may be an intentional choice. :shrug:

sheluchin commented 3 months ago

I believe using Pydantic models for GET params is not officially supported, and whatever of it works is only the case by accident. Using nested Pydantic models to define GET params is particularly problematic; there are many mentions of this if you search the issues and discussions. See this comment by @tiangolo.

The roadmap has plans for official support:

Support for Pydantic models for Query(), Form(), etc.

https://github.com/tiangolo/fastapi/discussions/10556#discussioncomment-7429554 https://github.com/tiangolo/fastapi/issues/318#issuecomment-584087181

TrevorBenson commented 3 months ago

I believe using Pydantic models for GET params is not officially supported, and whatever of it works is only the case by accident. Using nested Pydantic models to define GET params is particularly problematic; there are many mentions of this if you search the issues and discussions. See this comment by @tiangolo.

The roadmap has plans for official support:

Support for Pydantic models for Query(), Form(), etc.

#10556 (comment) #318 (comment)

That does appear to be the case. Thank you for the links, I had overlooked the comment in the closed discussion when it was marked answered w/ Pydantic v2 support, as well as the Roadmap as it did not explicitly mention Depends() for get methods or the resulting request body.

sheluchin commented 3 months ago

Since this comes up so often (judging by the number of issues and discussions), maybe it would be good to explicitly mention in the Query Parameters docs that using Pydantic models is not yet supported?

sonnix85 commented 3 months ago

Hi, if you want to store data in a class, consider using dataclasses with query parameters. For the body, if there are only optional fields and any data will be sent, you should use Depends().

j-carson commented 2 weeks ago

@TrevorBenson

Here is a way to think about what you are trying to do in your code and what the workaround might look like: It is important to understand that you are not specifying a pydantic model as an HTTP query parameter here: Your backend function is receiving a Pydantic model as an argument that was generated via a dependency injection.

It's important to keep the concepts of "query parameters" and "results of a dependency injection function" separate in your head: Query parameters at the HTTP level can only be types allowed by OpenAPI which is the specification underneath FastAPI. Dependency injection, on the other hand, can create whatever complicated object you want.

What's happening here is the default dependency injection function is not doing what you expected for deeply nested models. What is that dependency function doing? It's basically just looking at the type hint and seeing what fields the constructor needs and pulling them out of the HTTP query parameters. But, those query args that the constructor is receiving can only be types that the OpenAPI standard allows! So, you can't really send a Pydantic model in easily that way.

I would actually say that your "working" endpoint 1 is only working partially - If you look at the Swagger docs, you didn't get the description strings you specified for archive_name and archive_type, for example.
FastAPI is working as you expected, but the Swagger UI (and any application trying to use your openapi.json schema info directly) doesn't have the quite same picture as the one expressed in your code.

You can correct this by writing your own dependency injection function, rather than using what you get with a bare Depends(), which is what I'm going to do below. One of the key things to keep in mind when writing your own dependency injection: FastAPI figures things out by looking at type hints, but Swagger/OpenAPI figures things out by looking at the JSON schema of your API, which is generated by FastAPI using Pydantic, the glue between the highest and lowest levels. Basically, I'll be explicitly gluing things together by grabbing JSON schema for Swagger and arguments for FastAPI functions by pulling info out of your Pydantic classes.

Here's a handy glue function: This function takes info out of a particular field in a a Pydantic class object and and populates an arguments dictionary that we're going to want to pass to FastAPI's Query() API to make dependency magic happen.

def query_kwargs(
    class_: type[BaseModel],  # The Pydantic class we're making in the dependency injection
    field: str                # The name of the field we're trying to pull out of the query string
):              
    field_info =  class_.model_fields[field]
    is_required = field_info.is_required()
    description = field_info.description
    default = field_info.get_default()

    result = {
        'default': default,
        'required': is_required,
        'description': description,
    }

    if alias := field_info.serialization_alias:
        json_schema = class_.model_json_schema()['properties'][alias]
    else:
        json_schema = class_.model_json_schema()['properties'][field]

    result['json_schema_extra'] = json_schema
    return result

(The above will at least do the trick for basic field types, not fields that are nested models. If you run MyModelData2.model_json_schema() you'll see that it has a whole set of sub-definitions and other things that I'm not even checking for.)

Here is your simple class with a custom dependency function:

class MyModelData1(BaseModel):
    archive: Optional[str] = Field(None, description="Archive name")
    archive_type: Optional[str] = Field(None, description="Archive type")

    @classmethod
    def from_query(cls):
        """Class method to build an instance from the HTTP query string"""

        # Use my handy function above to generate the args to Query
        q_archive = query_kwargs(cls, 'archive')
        q_archive_type = query_kwargs(cls, 'archive_type')

        # Create a function that parses info out of the query explicitly, rather
        # that implicitly
        def dependency_to_inject(
            # Be sure these type hints match the type annotation in the class
            # field definitions or FastAPI could become confused in some cases!
            archive: Optional[str] = Query(**q_archive),
            archive_type: Optional[str] = Query(**q_archive_type),
        ) -> Self:
            return cls(archive=archive, archive_type=archive_type)
        return Depends(dependency_to_inject)

And then swap in this new function for your bare Depends() as below. (I also changed the response to echo back the params object that was built to make it easier to see what's happening.

@router1.get("/", description="Retrieve all documents.")
def get_data1(
    sort_by: str = Query(None, description="Sort by this field"),
    sort_order: str = Query(None, description="Sort order"),
    page: int = Query(1, description="Page number"),
    page_size: int = Query(100, description="Number of documents per page"),
    params: MyModelData1 = MyModelData1.from_query()  #  <---- Call the new function here!
):
    # Change the response to return the params as decoded to see what happened!
    return JSONResponse(content=params.model_dump())

Go run this new version and play with the MyModelData1 field definitions a little - update the description, default, type, etc. and see that the Swagger docs are now responding to those changes, type errors are caught, etc.

Now, let's move on to your deeply nested example.

The first thing I wanted to know when I tried to build a dependency function for this version was: What do you want your query-string to look like in when sending this request? Do you just want all the leaf-level field names to be piled in there, or do you want the deeply nested models to be passed as stringified-something (for example, to avoid problems like the name clash on the "id" field in MyModelData2 and MetaDataGet)?

That's a domain-specific answer, and I basically just punted on it for the code below. I said, OK, the "id" in the MyModelData2 object and the "id" in the MetaDataGet were just going to be the same value pulled from the query, since they have the same name, alias, and type. I also wasn't quite sure what do to with the array of DeepNestedModelGet objects -- I just assumed the user is required to send the same number of "name" and "version" files to make each element of the array have enough data and didn't even error check it.

Even with those simplifications, the dependency function is quite a bit more complicated: I need to pull fields out of the query and build the model tree from the bottom up:

class MyModelData2(BaseModel):
    id: Optional[str] = Field(None, alias="_id")
    details: Optional[DetailsModelGet]
    meta: Optional[MetadataGet]

    @classmethod
    def from_query(cls):
        """Class method to build an instance from a query string"""
        q_id = query_kwargs(cls, "id")

        # Plain fields of member 'details'
        q_some_data2 = query_kwargs(DetailsModelGet, "some_data2")

        # Plain fields of member 'details.some_data' -- except! We actually
        # want a list of these objects in the next level up so patch the
        # json schema up to make these arrays that can be in query string 
        # more than once

        q_name = query_kwargs(DeepNestedModelGet, "name")
        q_version = query_kwargs(DeepNestedModelGet, "version")

        q_name['json_schema_extra'] = {
            'type': 'array',
            'items': q_name['json_schema_extra']
        }
        q_version['json_schema_extra'] = {
            'type': 'array',
            'items': q_version['json_schema_extra']
        }

        # Plain fields of member "meta"
        # Note: id field name is already taken for this query so ???
        q_foreign_key = query_kwargs(MetadataGet, "foreign_key")

        def dependency_to_inject(
            id: Optional[str] = Query(**q_id),
            some_data2: Optional[list[Optional[str]]]  = Query(**q_some_data2),
            # These are now lists of the original field type
            name: Optional[list[Optional[str]]] = Query(**q_name),
            version: Optional[list[Optional[str]]] = Query(**q_version),
            foreign_key: Optional[str] = Query(**q_foreign_key),
        ):
            if name is not None and version is not None:
                some_data = [
                    DeepNestedModelGet(
                        name=n,
                        version=v
                    )
                    for n,v in zip(name, version)
                ]
            else:
                some_data = None

            result =  cls(
                _id=id,
                details = DetailsModelGet(
                    some_data =some_data,
                    some_data2 = some_data2
                ),
                meta = MetadataGet(
                    _id = id,
                    _foreign_key = foreign_key,
                )
            )
            return result
        return Depends(dependency_to_inject)

Now, change the definition of get_data2 to mimic what we did in get_data1 -- use the new .from_query() method instead of Depends() and change the response to dump the params argument to see what you got. If you open the new version in your Swagger docs, you will see a form with all the fields you need to create at least some version of your deeply nested model, and if you hit submit you'll see the JSON that the dependency injection calculated.

But, I hope you also see drawbacks of the above code:

I made a bunch of assumptions about how you wanted the HTTP query string formatted to even begin to write this function (FastAPI usually hides that). I just used the leaf-level field names as the query field names. Something like having query parameters details__some_data2 that encapsulate the model tree path might prevent name clashes, but it also exposes class implementation details you don't necessarily want the end user of your API to have to worry about.
Figuring out unique names for all the parts of a deeply nested tree would not be easy on the programmer either. Or, passing "details" as a single query parameter means you have encoded parameters that need a second level of decoding to find fields inside the sub-model, adding complexity for both the caller and the backend. There's basically no simple surefire way to define this endpoint.

I don't have good separation of concerns between the various class definitions in part 2 - the Data2 decoder has intimate knowledge of the fields of the classes it encloses and will have to be maintained carefully.
Changing the contents any of the above classes would quickly mess things up.

This code required full-stack understanding of OpenAPI, Pydantic, and FastAPI to get to this still half-baked result.
No joy of letting FastAPI take care of the details here, but to understand why this feature might not come, if ever: How could those details easily get taken care of by a library? Even if FastAPI could support deeply nested models, it probably couldn't support them magically under the covers with the code as you first wrote it. It would probably require at least some new Annotated[] directives to tell the system how you want it to behave at various important decision points, or it would have to have a lot of restrictions and throw runtime errors whenever things start getting tricky.