BeanieODM / beanie

Asynchronous Python ODM for MongoDB
http://beanie-odm.dev/
Apache License 2.0
2.09k stars, 219 forks

[BUG] Projection model not working in aggregation #1017

Open valentinoli opened 2 months ago

valentinoli commented 2 months ago

Describe the bug

When I pass a projection_model to FindMany.aggregate() and that model has an id field with alias="_id", the projection fails with the following error:

_id
  Field required [type=missing, input_value={[**REDACTED**]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.8/v/missing

This happens even if I set populate_by_name=True on the Pydantic model.

To Reproduce

from beanie import Document
from pydantic import BaseModel, ConfigDict, Field

class Doc(Document):
    id: str = Field(
        ...,
        alias="_id",
    )

class Model(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )

    id: str = Field(
        ...,
        alias="_id",
    )

async def query():
    results = await Doc.find().aggregate(aggregation_pipeline=[], projection_model=Model).to_list()
    return results

Expected behavior

This should work. I should not have to manually project each result from the query, like [Model(**res) for res in results]
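For reference, the manual projection mentioned above can be sketched with Pydantic alone. This is a minimal sketch, not Beanie's behavior: `raw_results` is a hypothetical stand-in for the plain dicts an aggregation would return without a projection_model, and `Model` mirrors the one in the reproduction.

```python
from pydantic import BaseModel, ConfigDict, Field, TypeAdapter


class Model(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    id: str = Field(..., alias="_id")


# Hypothetical raw aggregation output (plain dicts keyed by "_id").
raw_results = [
    {"_id": "1", "field_a": "a"},
    {"_id": "2", "field_a": "b"},
]

# The manual projection from the issue text: one model per raw dict.
# Keys that the model does not declare (field_a) are ignored by default.
projected = [Model(**res) for res in raw_results]

# Equivalent validation of the whole list in a single call.
projected_via_adapter = TypeAdapter(list[Model]).validate_python(raw_results)
```

Both forms validate the same data; the point of the issue is that passing projection_model should make this step unnecessary.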

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 30 days with no activity.

staticxterm commented 1 month ago

Hi, I am unable to reproduce this on Python 3.13, Beanie 1.27.0 (or even 1.26.0), and Pydantic 2.9.2 (or 1.10.18).

Code:

import asyncio

from beanie import Document, init_beanie
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel, ConfigDict, Field

class Doc(Document):
    id: str = Field(
        ...,
        alias="_id",
    )
    field_a: str
    field_b: str

class Model(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )

    id: str = Field(
        ...,
        alias="_id",
    )
    field_b: str

async def main():
    client = AsyncIOMotorClient("mongodb://localhost:27017")
    database = client["test-db"]

    await init_beanie(database, document_models=[Doc])

    # Run DB queries now.
    doc = Doc(id="1", field_a="a", field_b="b")
    result = await doc.save()
    print(result)

    results = (
        await Doc.find()
        .aggregate(aggregation_pipeline=[], projection_model=Model)
        .to_list()
    )
    print(results)

if __name__ == "__main__":
    asyncio.run(main())

Output:

id='1' revision_id=None field_a='a' field_b='b'
[Model(id='1', field_b='b')]

mg3146 commented 1 month ago

I feel like I had this issue once when using pydantic v1, seems slightly familiar... don't quote me on that though

valentinoli commented 1 month ago

Hey, thanks for the response. I will try to provide a better reproduction.

valentinoli commented 1 month ago

It's a bit of a "weird" case, but below is the full reproduction.

Here is the document example:

{
  "_id": "my_id",
  "field_list": [
    {
      "id": "id_1",
      "field_a": "a",
      "field_b": "b"
    },
    {
      "id": "id_2",
      "field_a": "a",
      "field_b": "b"
    }
  ]
}

The aim is to get the nested object as the result (projected to exclude field_b):

{
  "id": "id_1",
  "field_a": "a"
}
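To make the shapes concrete, here is a pure-Python sketch of what the three stages used in the reproduction ($unwind, $match, $replaceRoot) do to the example document. The helpers `unwind`, `match_nested`, `replace_root` and `project` are hypothetical stand-ins written for illustration, not MongoDB or Beanie APIs. The last step is an assumption that has not been confirmed in this thread: if a projection keyed on the model's alias ("_id") were applied after $replaceRoot, the nested "id" key would be dropped, which would match the "Field required" error for _id.

```python
from copy import deepcopy


def unwind(docs, field):
    """Emit one copy of each document per element of the array at `field`."""
    out = []
    for doc in docs:
        for item in doc.get(field, []):
            clone = deepcopy(doc)
            clone[field] = item
            out.append(clone)
    return out


def match_nested(docs, field, key, value):
    """Keep documents whose sub-document at `field` has `key` == `value`."""
    return [d for d in docs if d[field].get(key) == value]


def replace_root(docs, field):
    """Promote the sub-document at `field` to be the whole document."""
    return [d[field] for d in docs]


def project(docs, keys):
    """Keep only the listed keys; keys absent from a document are skipped."""
    return [{k: d[k] for k in keys if k in d} for d in docs]


stored = [
    {
        "_id": "my_id",
        "field_list": [
            {"id": "id_1", "field_a": "a", "field_b": "b"},
            {"id": "id_2", "field_a": "a", "field_b": "b"},
        ],
    }
]

unwound = unwind(stored, "field_list")
matched = match_nested(unwound, "field_list", "id", "id_1")
after_pipeline = replace_root(matched, "field_list")

# ASSUMPTION: a projection built from the model's aliases ("_id", "field_a").
after_alias_projection = project(after_pipeline, ["_id", "field_a"])
```

After $replaceRoot the document carries "id" but no "_id", so a model populated by field name could still validate it. After the assumed alias-based projection, neither key remains.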

Here is the code to reproduce. Notice that ProjectionModel.id has alias="_id" (it just does, don't ask why) and that the model sets populate_by_name=True. So the code below should work, but it doesn't.

import asyncio

from beanie import Document, init_beanie
from motor.motor_asyncio import AsyncIOMotorClient
from pydantic import BaseModel, Field, ConfigDict

class ProjectionModel(BaseModel):
    model_config = ConfigDict(
        populate_by_name=True,
    )

    id: str = Field(
        ...,
        alias="_id",
    )
    field_a: str

class Model(BaseModel):
    id: str
    field_a: str
    field_b: str

class Doc(Document):
    id: str = Field(
        ...,
        alias="_id",
    )
    field_list: list[Model]

async def query(list_item_id: str):
    results = (
        await Doc.find()
        .aggregate(
            aggregation_pipeline=[
                {
                    "$unwind": "$field_list",
                },
                {
                    "$match": {
                        "field_list.id": list_item_id,
                    },
                },
                {
                    "$replaceRoot": {
                        "newRoot": "$field_list",
                    }
                },
            ],
            projection_model=ProjectionModel,
        )
        .to_list()
    )
    return results

async def main():
    client = AsyncIOMotorClient("mongodb://localhost:27017")

    await init_beanie(
        database=client.test_db,
        document_models=[Doc],
    )

    # Run DB queries now.
    doc = Doc(
        id="my_id",
        field_list=[
            Model(
                id="id_1",
                field_a="a",
                field_b="b",
            ),
            Model(
                id="id_2",
                field_a="a",
                field_b="b",
            ),
        ],
    )
    result = await doc.save()
    print(result)

    try:
        results = await query(list_item_id="id_1")
        print(results)
    finally:
        # pass
        await doc.delete()

if __name__ == "__main__":
    asyncio.run(main())

github-actions[bot] commented 2 days ago

This issue is stale because it has been open 30 days with no activity.