dimagi / django-cte

Common Table Expressions (CTE) for Django
Other
334 stars 46 forks source link

Handle empty result sets in CTEs #92

Closed camuthig closed 5 months ago

camuthig commented 6 months ago

If Django finds a CTE during SQL generation that will result in empty results, it will throw an EmptyResultSet immediately. Because the CTEs are compiled to SQL prior to the base query in the CTEQueryCompiler, though, the base compiler is missing core information Django needs to gracefully handle this situation, like col_count and klass_info.

To resolve this, in the case of EmptyResultSet being thrown by CTE compilation, the base as_sql will be called as well before reraising.

The creation of the base queryset can not be move before the explain behaviors in generate_sql or the generated EXPLAIN will throw an error, so the try/except pattern is used instead.

Resolves #84

I think this will also work for #64 as well, but I couldn't figure out a way to throw that particular error.

millerdev commented 6 months ago

The wrong result is returned when using LEFT OUTER JOIN on the CTE. Here's a test to demonstrate:

    def test_left_outer_join_on_empty_result_set_cte(self):
        totals = With(
            Order.objects
            .filter(id__in=[])
            .values("region_id")
            .annotate(total=Sum("amount")),
            name="totals",
        )
        orders = (
            totals.join(Order, region=totals.col.region_id, _join_type=LOUTER)
            .with_cte(totals)
            .annotate(region_total=totals.col.total)
            .order_by("amount")
        )

        self.assertEqual(len(orders), 22)
camuthig commented 6 months ago

@millerdev I believe the latest commit should handle that case. It requires more introspection to handle the join, since the join aspect isn't what is throwing the EmptyResultSet - that is coming from the CTE itself. I leveraged the aliases map to understand how the CTE is being used in the rest of the query, and then the existing elide_empty parameter on the get_compiler function to not throw those errors if those CTEs were left joins outer joins.

In the master branch, the left outer joins still caused an empty result set and resulted in the klass_info access error. In this branch, that was originally altered to silently return no results after the empty result set was handled. Now it actually runs the query, but forces it to just not have results without throwing any errors.

millerdev commented 6 months ago

Tests are failing. Unfortunately elide_empty is not available on all supported versions of Django. There is also a lint error.

camuthig commented 6 months ago

It looks like elide_empty was added in 4.0.

I have fixed the lint errors and cleaned up the code a bit to enable specifically handling the situation in the except block for the older versions of Django. This is basically the same way that elide_empty works under the hood, it just isn't as clean of an implementation in the library-space.

camuthig commented 6 months ago

Thanks for the feedback. It looks like the connector has always been optional, so I have removed it as requested and pulled in the other feedback.

millerdev commented 5 months ago

Thank you for the contribution @camuthig!

camuthig commented 5 months ago

No problem. Thanks for the awesome project @millerdev . My team is starting to lean on it more to improve our more out of control queries. Do you know when you might cut a release that includes this change? We are pinned to my branch until we can get back onto official releases, but we have a few behaviors that fall over without this fix.

millerdev commented 5 months ago

Thanks for the prompt. New release: https://pypi.org/project/django-cte/1.3.3/

SupImDos commented 5 months ago

@millerdev Not a huge deal, but would it be possible to do a tag / release on GitHub for v1.3.3 as well?

millerdev commented 5 months ago

Yes, it had been tagged, but I forgot to push it. https://github.com/dimagi/django-cte/releases/tag/v1.3.3