tiangolo / sqlmodel

SQL databases in Python, designed for simplicity, compatibility, and robustness.
https://sqlmodel.tiangolo.com/
MIT License
13.65k stars 615 forks source link

Selectin for the SQLModels relations #444

Open busykoala opened 1 year ago

busykoala commented 1 year ago

First Check

Commit to Help

Example Code

class Team(SQLModel):
    id: int = Field(default=None, primary_key=True)
    heroes: List["Hero"] = Relationship(back_populates="team", sa_relationship_kwargs={"lazy": "selectin"})

class Hero(SQLModel):
    name: str

    team_id: Optional[int] = Field(default=None, foreign_key="team.id")
    team: Optional[Team] = Relationship(back_populates="incidents")

app = FastAPI()

@app.get("/teams/", response_model=List[Team])
def get_teams() -> List[Team]:
    with Session(engine) as session:
        return session.exec(select(Team)).all()

desired_output = [
    {
        "id": 1,
        "heroes": [
            {"name": "Batman"},
            {"name": "Superman"}
        ]
    }
]

Description

When returning a list of Teams the Relationship fields are not covered in the output. I believe the response is using pydantic.main.BaseModel.json() to get the data (and there it will only cover non-Relationship fields).

It would be great if there was an option to also return the Relationship fields especially if {"lazy": "selectin"} was enabled for the relation. It's an easy workaround to walk the objects and return the complete data as json in the endpoint, but since we already have the models that is not very satisfying (also because the response_model for openapi will not be easily achievable).

Did I miss an option to enable having the Relationship attributes in the response, or is that something that could be worked on? I would also be happy to contribute but would need a pointer on where to start.

Operating System

Linux

Operating System Details

No response

SQLModel Version

0.0.7

Python Version

3.10.4

Additional Context

No response

daniil-berg commented 1 year ago

First off, if you haven't already, I suggest reading this section of the docs. It addresses the issue of infinite recursion (among other things) that can easily happen, if you don't think carefully about what data you actually want to get.

Second, I assume you forgot to copy the table=True in your example models. Without those, you can't use the models for any database queries.

The cyclic dilemma

There are already quite a few issues around this topic of including relationships in such a way. I haven't looked deeply into the code for this yet. But as I understand it, it is not at all clear, how exactly you should tackle the problem of cyclical relationship graphs. Not that there aren't any theoretical solutions. It seems to be more of a question of what a "sensible" approach may look like. In the end, it would have to be up to the user to decide, when exactly to stop walking along the relationship graph.

The most straightforward solution, and the one used right now is to simply avoid the issue altogether and force the user to e.g. set up his own response models (not tables) with the desired nesting of sub-models. (See the section "What Data to Include" on the aforementioned docs page.)

Be explicit yourself

So I would suggest closely following the advice about model inheritance from the docs for now. That means for your desired output:

from typing import List, Optional

from fastapi import FastAPI, Depends
from sqlmodel import Field, Relationship, Session, SQLModel, create_engine, select

class TeamBase(SQLModel):
    name: str = Field(index=True)
    ...

class Team(TeamBase, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    heroes: List["Hero"] = Relationship(back_populates="team")

class TeamRead(TeamBase):
    id: int

class HeroBase(SQLModel):
    name: str = Field(index=True)
    ...
    team_id: Optional[int] = Field(default=None, foreign_key="team.id")

class Hero(HeroBase, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    team: Optional[Team] = Relationship(back_populates="heroes")

class HeroRead(HeroBase):
    id: int

class TeamReadWithHeroes(TeamRead):
    heroes: List[HeroRead] = []

app = FastAPI()

sqlite_url = "sqlite:///test.db"
engine = create_engine(sqlite_url, echo=True)

def create_db_and_tables():
    SQLModel.metadata.create_all(engine)

def get_session():
    with Session(engine) as session:
        yield session

@app.on_event("startup")
def on_startup():
    create_db_and_tables()

@app.get("/teams/", response_model=List[TeamReadWithHeroes])
def get_teams(*, session: Session = Depends(get_session)) -> List[Team]:
    return session.exec(select(Team)).all()

Under the hood, FastAPI will basically call the TeamReadWithHeroes.from_orm() method on each Team instance returned by the database query. This will populate the heroes field on that model and give you the desired result:

[
  {
    "name": "theteam",
    "id": 1,
    "heroes": [
      {
        "name": "foo",
        "team_id": 1,
        "id": 1
      },
      {
        "name": "bar",
        "team_id": 1,
        "id": 2
      }
    ]
  }
]
busykoala commented 1 year ago

Thanks for the elaborate reply! table=True is indeed missing sorry for that mistake.

Cyclic Dilemma I've read the documentation regarding the issue, it would be up to the implementation whether that is an issue or not. It would only be one if the table design is accordingly.

Regarding the Suggestion The suggested code would in my case duplicate every single model with the exact same attributes, in which case it would probably be smarter to work with sqlalchemy and pydantic only.

Thanks for the explanation and hopefully there will be a possibility in the future to directly decide upon whether to include the relations or not.

daniil-berg commented 1 year ago

Yes, I know what you mean. I've been thinking about this a lot recently.

Maybe someone will figure out an intuitive design for this. Because I don't think the issue here is that implementation would be hard. It's the question of "how should the interface look"? If you or anyone else has an idea, like "could we maybe have a function like this ...", I am genuinely interested.

Regarding duplication, I can't imagine that it would actually force you to duplicate every attribute in every case. :eyes: I don't know your specific use case, but I can't imagine SQLModel not leading to significantly less code (compared to Pydantic+SQLAlchemy). Even with the relationship issue... But you'll know best what works for you.

antont commented 1 year ago

Yeah @busykoala how does it duplicate, when with the inheritance you get reuse? I'm using this pattern quite happily.

class TeamReadWithHeroes(TeamRead):
    heroes: List[HeroRead] = []
busykoala commented 1 year ago

As a workaround, I played around a bit and the below code was the outcome. The SQLModel class could provide something similar (implemented a bit cleaner) to walk the objects. Also, this way an additional counter variable would make it easy to stop infinite recursion if that was desired.

The below code only walks relations that have lazy selectin enabled like Relationship(back_populates="team", sa_relationship_kwargs={"lazy": "selectin"}), if not enabled the relations will be skipped.

I've provided an example serializer to show how that could be customized if needed as well.

# --------------------------------------------------------------------
# serializers.py
# --------------------------------------------------------------------
from datetime import datetime

serializers = {
    datetime: lambda x: x.isoformat()
}

# --------------------------------------------------------------------
# sqlmodel_walk.py
# --------------------------------------------------------------------
from typing import Any
from typing import Dict
from typing import List
from typing import Union

from sqlmodel import SQLModel

class SQLModelWalk:
    def __init__(self, models: Union[List[SQLModel], SQLModel], serializers: Union[Dict[Any, Any], None] = None):
        self.serializers = serializers if serializers else {}
        self.data: Union[List, Dict] = self._add_current(models, [] if isinstance(models, List) else {})

    def _add_current(self, models: Union[List[SQLModel], SQLModel], parent: Union[List, Dict]) -> Union[List, Dict]:
        # handle list of objects
        if isinstance(models, List) and isinstance(parent, List):
            parent.extend([self._add_current(obj, {}) for obj in models])
            return parent
        # handle single objects
        keys = [attr for attr in vars(models).keys() if "_sa_instance_state" not in attr]
        for key in keys:
            current_attr = getattr(models, key)
            if isinstance(current_attr, List):
                current_attr = [self._add_current(obj, {}) for obj in current_attr]
            if isinstance(current_attr, SQLModel):
                current_attr = self._add_current(current_attr, {})
            if isinstance(current_attr, tuple(self.serializers.keys())):
                current_attr = self._serialize_types(current_attr)
            if not isinstance(parent, Dict):
                raise Exception("Walk error (parent is not a dict).")
            parent[key] = current_attr
        return parent

    def _serialize_types(self, field: Any) -> str:
        serializer = self.serializers.get(type(field))
        if not serializer:
            raise Exception("No matching serializer found for data.")
        serialized_data = serializer(field)
        if not isinstance(serialized_data, str):
            raise Exception("Serializer not converting to string.")
        return serialized_data

I'm probably forgetting about many things, but that might be a start anyways.

xavipolo commented 1 year ago

Does anyone have the models separated in different files? When I separate the Hero from the Team in two different files I get an error when open "/docs/"

File "/home/xpolo/dev/python/fastapi-test/.env/lib/python3.10/site-packages/fastapi/utils.py", line 35, in get_model_definitions
    m_schema, m_definitions, m_nested_models = model_process_schema(
  File "pydantic/schema.py", line 580, in pydantic.schema.model_process_schema
  File "pydantic/schema.py", line 621, in pydantic.schema.model_type_schema
  File "pydantic/schema.py", line 254, in pydantic.schema.field_schema
  File "pydantic/schema.py", line 461, in pydantic.schema.field_type_schema
  File "pydantic/schema.py", line 847, in pydantic.schema.field_singleton_schema
  File "pydantic/schema.py", line 698, in pydantic.schema.field_singleton_sub_fields_schema
  File "pydantic/schema.py", line 526, in pydantic.schema.field_type_schema
  File "pydantic/schema.py", line 921, in pydantic.schema.field_singleton_schema
  File "/usr/lib/python3.10/abc.py", line 123, in __subclasscheck__
    return _abc_subclasscheck(cls, subclass)
TypeError: issubclass() arg 1 must be a class

In one file all is running ok, but not with separated files. All is running OK with separated files, without the HeroReadWithTeam / TeamReadWithHeroes

Note: I'm using TYPE_CHECKING with imports.