SQLAlchemy Relationships fail when using secondary Table

floan commented 1 year ago

Hey! Thanks for your great work on this!

I've queried SQLAlchemy Relationships with this library which works just fine, but fails when I use a secondary table. Example:

class Account():
    ...
    practice = relationship("Practice", secondary="user_practice", back_populates="accounts", uselist=False)

class Practice():
    ...
    accounts = relationship("Account", secondary="user_practice", back_populates="practice")

user_practice = Table(
    ...
    Column("account_id", ForeignKey("account.id", ondelete="CASCADE"), primary_key=True, unique=True),
    Column("practice_id", ForeignKey("practice.id", ondelete="CASCADE"), primary_key=True, index=True),
)

My query is something like this:

{
    accounts {
        edges {
            node {
                practice {
                    id
                }
            }
        }
    }
}

The library tries to look for account_id in the Practice table rather than the secondary table. I've also attached part of a stack trace below:

  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry_sqlalchemy_mapper/mapper.py", line 420, in resolve
    related_objects = await loader.loader_for(relationship).load(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry/dataloader.py", line 251, in dispatch_batch
    values = await loader.load_fn(keys)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/backend/routers/graphql/mapper.py", line 109, in load_fn
    loaded = await loader.load_many(keys)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry/dataloader.py", line 251, in dispatch_batch
    values = await loader.load_fn(keys)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry_sqlalchemy_mapper/loader.py", line 49, in load_fn
    grouped_keys[group_by_remote_key(row)].append(row)
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry_sqlalchemy_mapper/loader.py", line 41, in group_by_remote_key
    [
  File "/home/vscode/.local/lib/python3.11/site-packages/strawberry_sqlalchemy_mapper/loader.py", line 42, in <listcomp>
    getattr(row, remote.key)
AttributeError: 'Practice' object has no attribute 'account_id'

Upvote & Fund

We're using Polar.sh so you can upvote and help fund this issue.
We receive the funding once the issue is completed & confirmed by you.
Thank you in advance for helping prioritize & fund our backlog.

TimDumol commented 1 year ago

I might take a stab at fixing this maybe next week, but as a workaround, you can get this working by adding a UserPractice model (so explicitly modeling the join table), and then using an association_proxy instead.

RubenLagrouw commented 1 year ago

+1, I ran into the same issue

Philaeux commented 1 year ago

I don't get any error with an association table, however I don't get the result either. My example take entries associated with tags with an association table. My request is the following one:

{
  entry(entryId: "d317ae6b-a360-4b16-b223-7297ac9d0276") {
    id
    tags {
      edges {
        node {
          id
        }
      }
    }
  }
}

The result I get with such a query

@strawberry.type
class QueryEntry:

    @strawberry.field
    def entry(self, info, entry_id: uuid.UUID) -> EntryGQL | None:
        with info.context['session_factory']() as session:
            db_entry = session.query(Entry).where(Entry.id == entry_id).one_or_none()
            # print(len(db_entry.tags)) # (1)
            return db_entry

Gets me this result

{
  "data": {
    "entry": {
      "id": "d317ae6b-a360-4b16-b223-7297ac9d0276",
      "tags": {
        "edges": []
      }
    }
  }
}

But if I un-comment the python comment number (1), I get this result.

{
  "data": {
    "entry": {
      "id": "d317ae6b-a360-4b16-b223-7297ac9d0276",
      "tags": {
        "edges": [
          {
            "node": {
              "id": "049abb56-faf3-49d7-848f-e969fbcd7735"
            }
          }
        ]
      }
    }
  }
}

So it looks like lazy loading is not working but the link works.

fruitymedley commented 5 months ago

I wanted to check in on the status of this issue, as it is a problem I am encountering myself, and I may have a solution.

For context, I have an ORM like so:

categoryParentChild = Table(
    "category_parent_child",
    . . .
    Column("parent_id", ForeignKey("category.id")),
    Column("child_id", ForeignKey("category.id")),
)

class Category(Base):
    . . .
    parent: MappedColumn[Category] = relationship(
        "category",
        secondary="category_parent_child",
        primary_join="category.c.id == category_parent_child.c.child_id",
        secondary_join="category.c.id == category_parent_child.c.parent_id",
        uselist=False,
    )
    . . .

Looking at loader.py:66-73

query = select(related_model).filter(
    tuple_(
        *[remote for _, remote in relationship.local_remote_pairs or []]
    ).in_(keys)
)
if relationship.order_by:
    query = query.order_by(*relationship.order_by)
rows = await self._scalars_all(query)

this seems to be the culprit. Say I wanted to get category -> parent. In this function, we select the related model (also of type category in this case, but could be different as well). This populates category into the select statement. Next, we get the pseudo-join statement in the form of relationship.local_remote_pairs. In a primary join, we expect the first term of these pairs to be from the source table (child category), which we don't care about and discard (hence _). Rather, we take the second term - assumed to be the target table's join column - and filter on that, to which we pass keys, which are the child foreign keys (in my example, category.id).

For a primary join, this works, but we don't have a primary join, we have a secondary join, which contains not one, but two local_remote_pairs. The first, between the source and the secondary table (category.id -> category_parent_child.child_id) and another between the target and the secondary table (category.id -> category_parent_child.parent_id). Discarding the first term in each pair, as in the code, nets us roughly the following query:

select *
from category, category_parent_child -- note the lack of an 'on' clause
where (category_parent_child.child_id, category_parent_child.parent_id) in (:keys)

Without an 'on' clause, we get no means of relating the target to the secondary table, and, as such, fail. The relationship object does give us access to the secondary table, though, meaning verifying this is feasible. If no solution has been produced, might I take a stab at this?

fruitymedley commented 5 months ago

Looks like someone beat me to identifying this issue, but I'm not seeing the changes from this merge.

Ninefiveblade commented 5 months ago

I wanted to check in on the status of this issue, as it is a problem I am encountering myself, and I may have a solution.

For context, I have an ORM like so:
categoryParentChild = Table(
    "category_parent_child",
    . . .
    Column("parent_id", ForeignKey("category.id")),
    Column("child_id", ForeignKey("category.id")),
)

class Category(Base):
    . . .
    parent: MappedColumn[Category] = relationship(
        "category",
        secondary="category_parent_child",
        primary_join="category.c.id == category_parent_child.c.child_id",
        secondary_join="category.c.id == category_parent_child.c.parent_id",
        uselist=False,
    )
    . . .
Looking at loader.py:66-73
query = select(related_model).filter(
    tuple_(
        *[remote for _, remote in relationship.local_remote_pairs or []]
    ).in_(keys)
)
if relationship.order_by:
    query = query.order_by(*relationship.order_by)
rows = await self._scalars_all(query)
this seems to be the culprit. Say I wanted to get category -> parent. In this function, we select the related model (also of type category in this case, but could be different as well). This populates category into the select statement. Next, we get the pseudo-join statement in the form of relationship.local_remote_pairs. In a primary join, we expect the first term of these pairs to be from the source table (child category), which we don't care about and discard (hence _). Rather, we take the second term - assumed to be the target table's join column - and filter on that, to which we pass keys, which are the child foreign keys (in my example, category.id).

For a primary join, this works, but we don't have a primary join, we have a secondary join, which contains not one, but two local_remote_pairs. The first, between the source and the secondary table (category.id -> category_parent_child.child_id) and another between the target and the secondary table (category.id -> category_parent_child.parent_id). Discarding the first term in each pair, as in the code, nets us roughly the following query:
select *
from category, category_parent_child -- note the lack of an 'on' clause
where (category_parent_child.child_id, category_parent_child.parent_id) in (:keys)
Without an 'on' clause, we get no means of relating the target to the secondary table, and, as such, fail. The relationship object does give us access to the secondary table, though, meaning verifying this is feasible. If no solution has been produced, might I take a stab at this?

Did you solved this issue? Have same problem and think as you

fruitymedley commented 5 months ago

@Ninefiveblade not yet. I can raise a PR with the fix, though.

Ckk3 commented 5 months ago

@Ninefiveblade not yet. I can raise a PR with the fix, though.

Hi, @fruitymedley , I'm not a maintainer but I really support you to create a PR with the fix. This problem is around for a while and resolve this will be very good.

strawberry-graphql / strawberry-sqlalchemy

SQLAlchemy Relationships fail when using secondary Table #19

Upvote & Fund