W1ldPo1nter / django-queryable-properties

Write Django model properties that can be used in database queries.
BSD 3-Clause "New" or "Revised" License
72 stars 1 forks source link

Not compatible with sliced prefetch_related using Window functions #9

Closed SafaAlfulaij closed 11 months ago

SafaAlfulaij commented 1 year ago

Django 4.2 added support for sliced prefetch_related using Window functions.

It looks like that the current implementation around aliasing properties is having an issue with that: https://github.com/W1ldPo1nter/django-queryable-properties/blob/d189a14dc57c0854c18a1de1d51a145f4a05e7f7/queryable_properties/managers.py#L147-L163

Taking the ApplicationWithClassBasedProperties model from tests, these lines fail (below is pseudo code):

r_qs = VersionWithClassBasedProperties.objects.select_properties("released_in_2018")
qs = ApplicationWithClassBasedProperties.objects.prefetch_related(Prefetch("versions", r_qs[1:2], to_attr="test"))
print(qs)

Exception from database (postgreSQL):

django.db.utils.ProgrammingError: column "released_in_2018__" does not exist
LINE 1: ...col7", "col8", "col9", "col10", "col11", "col12", "released_in_2...
                                                             ^
HINT:  Perhaps you meant to reference the column "qualify_mask.released_in_2018".

This also shows when printing the query of a queryset, where we don't see the aliased released_in_2018__, but we see released_in_2018. When the query is actually executed, Django debug toolbar shows that it was with the aliased name.

The issue is that the inner query is not aliased released_in_2018 while the outer query is aliased released_in_2018__:


SELECT "col1",
       "col2",
       "col3",
       "col4",
       "col5",
       "col6",
       "col7",
       "col8",
       "col9",
       "col10",
       "col11",
       "col12",
       "released_in_2018__"
  FROM (
        SELECT *
          FROM (
                SELECT "model"."field" AS "col1",
                       "model"."field" AS "col2",
                       "model"."field" AS "col3",
                       "model"."field" AS "col4",
                       "model"."field" AS "col5",
                       "model"."field" AS "col6",
                       "model"."field" AS "col7",
                       "model"."field" AS "col8",
                       "model"."field" AS "col9",
                       "model"."field" AS "col10",
                       "model"."field" AS "col11",
                       "model"."field" AS "col12",
                       COALESCE((SELECT * FROM *, 0) AS "released_in_2018",
                       ROW_NUMBER() OVER (PARTITION BY "model"."related_field_id") AS "qual0"
                  FROM "model"
                 WHERE "model"."related_field_id" IN ('36744aef-3d0b-4001-8df0-2720b93cd607'::uuid)
               ) "qualify"
         WHERE ("qual0" > 1 AND "qual0" <= 2)
       ) "qualify_mask"
SafaAlfulaij commented 1 year ago

After playing around a bit, this seems to work, but it causes regressions with ordering..:

    def _setup_queryable_properties(self):
        super(QueryablePropertiesModelIterableMixin, self)._setup_queryable_properties()
        query = self.queryset.query
        select = dict(query.annotation_select)
+        annotations = dict(query.annotations)

        for property_ref, changed_name in six.iteritems(self._queryable_property_aliases):
            select[changed_name] = select.pop(six.text_type(property_ref.full_path))
+            annotations[changed_name] = annotations.pop(six.text_type(property_ref.full_path))
        setattr(query, ANNOTATION_SELECT_CACHE_NAME, select)
+        setattr(query, "annotations", annotations)
W1ldPo1nter commented 1 year ago

Thanks for the report, I'll look into it once I find some time. This could also mean that there might be problems with certain subquery scenarios in general, so I'm going to have to dive in a bit deeper. I'm going to post an update here once I've gotten around to it.

W1ldPo1nter commented 1 year ago

I did look into it and confirmed my suspicion: cases where a query pretty much "uses itself" as a subquery (like the sliced prefetch scenario) suffer from this in general.

There's also the gotcha that the code you've provided actually works without issues in the test setup, which uses SQLite. SQLite (or at least the version i was using) just doesn't seem to care about the column name mismatch. But we're obviously dealing with a bug here, which is even kind of obvious: subqueries do not use the iterable classes as they're embedded into the outer query and thus the aliasing cannot be applied the same way.

To fix this, I'm planning to get rid of the aliasing entirely, so that the column names will match automatically. This does require a new approach to setting selected property values on the created model instances. Since you're using a recent Django version, you could in theory already try the fix using this branch. I'll still have to make it work with older Django versions and add some more tests, but I hope that I can finish the fix and then release a new version soon-ish.

SafaAlfulaij commented 1 year ago

Thanks a lot for looking into this! I didn't really need to use queryable fields in subqueries a lot, but I had to use the sliced prefetcher as I'm using strawberry-graphql with nested relay paginated connections, and I had to optimize some things.

I tested out the branch on Django 4.2.7, and the issue was resolved :)

Tbh, I'm not sure why you are still supporting such old Django versions, maybe for your projects.

Thanks again and appreciated!

W1ldPo1nter commented 1 year ago

Tbh, I'm not sure why you are still supporting such old Django versions, maybe for your projects.

Yep, got some legacy projects to take care of.

The fix is now on PyPI as version 1.8.5.