enso-org / enso

Hybrid visual and textual functional programming.
https://enso.org
Apache License 2.0
7.31k stars 317 forks source link

Banker's rounding creates extremely complex query in Postgres #10306

Open radeusgd opened 1 week ago

radeusgd commented 1 week ago

I have just tried using round with use_bankers=True in Postgres, and the query that has got created is following:

SELECT CAST((CASE WHEN ("foo"."X" >= ?) THEN (CASE WHEN (CASE WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) != (?) THEN (("foo"."X" % ?) >= ?) WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) != (?) IS NULL THEN NULL ELSE (("foo"."X" % ?) > ?) END) THEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) * ?) + ?) WHEN (CASE WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) != (?) THEN (("foo"."X" % ?) >= ?) WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) != (?) IS NULL THEN NULL ELSE (("foo"."X" % ?) > ?) END) IS NULL THEN NULL ELSE (CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) * ?) END) WHEN ("foo"."X" >= ?) IS NULL THEN NULL ELSE (CASE WHEN (CASE WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) = (?) THEN (("foo"."X" % ?) < ?) WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) = (?) IS NULL THEN NULL ELSE (("foo"."X" % ?) <= ?) END) THEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) * ?) - ?) WHEN (CASE WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) = (?) THEN (("foo"."X" % ?) < ?) WHEN ((CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) % ?)) = (?) IS NULL THEN NULL ELSE (("foo"."X" % ?) <= ?) END) IS NULL THEN NULL ELSE (CAST((CASE WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) THEN CAST("foo"."X" AS double precision) / CAST(? AS double precision) WHEN ((CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN')) OR (CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'Infinity', double precision '-Infinity'))) IS NULL THEN NULL ELSE TRUNC(CAST("foo"."X" AS double precision) / CAST(? AS double precision)) END) AS int8) * ?) END) END) AS float8) AS "round([X])" FROM "foo" AS "foo"

This is all to round just one column.

I can't even fit it on one screen: image

The query contains 112(!) interpolated constants.

I think we should try to reducing the complexity of this query at least a bit.

  1. Firstly, we can surely 'bake-in' these constants into the query as numeric literals (we already mostly support that through SQL_Expression.Literal. This will remove so many ? interpolations and their associated CASTs - making the query much shorter and more readable as we will know which constant is used where. It does not seem like a good idea to use so many interpolations overall.
  2. If possible, it would be good if we could somehow refactor duplicated sub-expressions.
    • CAST("foo"."x" AS float) appears 72 times.
    • CAST("foo"."X" AS double precision) / CAST(? AS double precision) in (double precision 'NaN') seems to appear 24 times.
    • if I read correctly CAST("foo"."x" AS float) / CAST(? AS float) in ('Infinity'::float, '-Infinity'::float) appears another 24 times.
    • If we implement #7983, we could use CTE to encapsulate these common expressions, making the whole query much simpler and more readable.
    • Even without CTE, we could avoid the duplication by first giving the nested expressions by computing them in a nested SELECT so that the outer layer query can then reference them by name. This is possible to implement right now.