litestar-org / polyfactory

Simple and powerful factories for mock data generation
https://polyfactory.litestar.dev/
MIT License

Enhancement: Support for dictionaries with Pydantic models as value, e.g. dict[str, PydanticClass] #512

Open ErikvdVen opened 5 months ago

ErikvdVen commented 5 months ago

Summary

As mentioned on the Discord channel, the code below results in a Pydantic validation error. If I turn households into a list (both in the defaults dict and in the HouseHolds class), it all works. But using dict[str, HouseHold] as the type hint does not.

As a maintainer g... pointed out: This is something we just don't handle in polyfactory currently. When given a dictionary for households, then it's assumed that the value you've given is what should be used. That means we end up not creating instances of HouseHold, but just pass in the raw dictionary and pydantic complains.

Currently we do support Sequence[SomeModel], but no other type is supported. I think supporting things like Mapping[str, SomeModel] is going to be somewhat complex, though I'm not 100% sure.

The code:

from pydantic import Field, BaseModel
from uuid import UUID, uuid4
from typing_extensions import TypedDict
from typing import Union
from datetime import date, datetime

from polyfactory.factories.pydantic_factory import ModelFactory
from polyfactory.factories import TypedDictFactory

class RelativeDict(TypedDict):
    household_id:str
    familymember_id:str

class FamilyMember(BaseModel):
    familymember_id: str
    name: str
    hobbies: list[str]
    age: Union[float, int]
    birthday: Union[datetime, date]
    relatives: list[RelativeDict]

class HouseHold(BaseModel):
    household_id: str
    name: str
    familymembers: list[FamilyMember]

class HouseHolds(BaseModel):
    id: UUID = Field(default_factory=uuid4)
    households: dict[str, HouseHold]

class RelativeDictFactory(TypedDictFactory[RelativeDict]):
    ...

class FamilyMemberFactory(ModelFactory[FamilyMember]):
    relatives = list[RelativeDictFactory]

class HouseHoldFactory(ModelFactory[HouseHold]):
    familymembers = list[FamilyMemberFactory]

class HouseHoldsFactory(ModelFactory[HouseHolds]):
    ...

defaults = {
    "households": {
        "beck": {
            "household_id": "beck", 
            "name": "Family Beck", 
            "familymembers": [
                {
                    "familymember_id": "linda",
                    "relatives": [
                        {
                            "household_id": "grant",
                            "familymember_id": "dad"
                        }
                    ]
                }, 
                {"familymember_id": "erik"}
            ]
        }
        ,"grant":{
            "household_id": "grant", 
            "name": "Family Grant", 
            "familymembers": [
                {"familymember_id": "dad"}, 
                {"familymember_id": "mother"}
            ]
        }
    }
}

test = HouseHoldsFactory.build(**defaults)
print(test)

Just running the build method without any defaults works, but as you can see, the relatives key of the FamilyMember class should only contain a household id and a family member id that actually exist (i.e. the id of a HouseHold that contains a FamilyMember with that id). So there are dependencies between the objects, and that is the reason for this whole dictionary of required defaults.

It does save us from having to provide all the other fields. In practice these classes could contain a ton of extra fields that have no dependencies, so Polyfactory could easily fake those for us.
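To make the dependency concrete, here is a minimal sketch of the constraint the defaults have to satisfy: every entry in relatives must point at a household and family member that exist in the same dictionary. It works on plain dicts only and does not use polyfactory; the helper name find_dangling_relatives is hypothetical and the sample data is a trimmed copy of the defaults above.

```python
from typing import Any


def find_dangling_relatives(
    households: dict[str, dict[str, Any]],
) -> list[tuple[str, str, dict[str, str]]]:
    """Return (household_id, familymember_id, relative) for every relative
    that points at a household/member pair not present in `households`."""
    # Collect every (household_id, familymember_id) pair that is defined.
    known = {
        (household.get("household_id", key), member["familymember_id"])
        for key, household in households.items()
        for member in household.get("familymembers", [])
    }
    dangling = []
    for key, household in households.items():
        household_id = household.get("household_id", key)
        for member in household.get("familymembers", []):
            for relative in member.get("relatives", []):
                pair = (relative["household_id"], relative["familymember_id"])
                if pair not in known:
                    dangling.append(
                        (household_id, member["familymember_id"], relative)
                    )
    return dangling


# Trimmed copy of the defaults from the issue (hypothetical sample data).
sample_households = {
    "beck": {
        "household_id": "beck",
        "familymembers": [
            {
                "familymember_id": "linda",
                "relatives": [
                    {"household_id": "grant", "familymember_id": "dad"}
                ],
            },
            {"familymember_id": "erik"},
        ],
    },
    "grant": {
        "household_id": "grant",
        "familymembers": [
            {"familymember_id": "dad"},
            {"familymember_id": "mother"},
        ],
    },
}

dangling = find_dangling_relatives(sample_households)
print(dangling)
```

A check like this could run in a test before the defaults are handed to the factory, so that a broken reference shows up as a clear message rather than as inconsistent mock data.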

Basic Example

Maintainer g... provided a basic solution for now:

defaults = {
    "households": {
        "beck": HouseHold(...),
        "grant": HouseHold(...),
    }
}

So that should work, or, as he suggested, I could override the build method.

What I think would be ideal is:

class HouseHoldsFactory(ModelFactory[HouseHolds]):
    households: dict[str, HouseHoldFactory]

So the ModelFactory knows when it hits the key households, it actually knows how to handle that object (the keys should be a string and the values should be parsed with that HouseHoldFactory class).
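As a rough illustration of what "knowing how to handle that object" could mean, the sketch below walks a type hint and, when it sees dict[str, SomeModel], builds each raw dict value into a model instance. This is not polyfactory's implementation and uses none of its API: it relies on standard-library dataclasses as stand-ins for the Pydantic models, the helpers build and coerce are hypothetical, and field defaults stand in for the fake data a real factory would generate.

```python
from dataclasses import dataclass, field, is_dataclass
from typing import Any, get_args, get_origin, get_type_hints


@dataclass
class HouseHold:
    household_id: str
    # A real factory would generate this value; a default stands in here.
    name: str = "unnamed"


@dataclass
class HouseHolds:
    households: dict[str, HouseHold] = field(default_factory=dict)


def coerce(annotation: Any, value: Any) -> Any:
    """Turn raw dicts into model instances, guided by the type hint."""
    origin = get_origin(annotation)
    if origin is dict and isinstance(value, dict):
        # dict[K, V]: keep the keys, coerce every value against V.
        _, value_type = get_args(annotation)
        return {key: coerce(value_type, item) for key, item in value.items()}
    if origin is list and isinstance(value, list):
        # list[V]: coerce every element against V.
        (item_type,) = get_args(annotation)
        return [coerce(item_type, item) for item in value]
    if isinstance(annotation, type) and is_dataclass(annotation) and isinstance(value, dict):
        # A partial dict for a model: build the model from it.
        return build(annotation, **value)
    # Anything else (including ready-made instances) is passed through.
    return value


def build(model: type, **overrides: Any) -> Any:
    """Build `model`, coercing each override according to its annotation."""
    hints = get_type_hints(model)
    kwargs = {name: coerce(hints[name], value) for name, value in overrides.items()}
    return model(**kwargs)


result = build(
    HouseHolds,
    households={"beck": {"household_id": "beck"}, "grant": {"household_id": "grant"}},
)
print(result)
```

The point of the sketch is only the dispatch on the mapping's value type; in the proposal above, the declared HouseHoldFactory would take the place of the recursive build call.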

Drawbacks and Impact

No response

Unresolved questions

No response


ErikvdVen commented 5 months ago

I also found that it fails when specifying a size:

from polyfactory import Use

class HouseHoldFactory(ModelFactory[HouseHold]):
    familymembers = Use(FamilyMemberFactory, size=10)

This also tries to generate a list and fails. For now I fixed it this way:

class HouseHoldFactory(ModelFactory[HouseHold]):
    @classmethod
    def familymembers(cls) -> dict[str, FamilyMember]:
        # batch() returns FamilyMember instances; key them by their id field.
        familymembers = FamilyMemberFactory.batch(size=10)
        return {familymember.familymember_id: familymember for familymember in familymembers}