MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

Nullability inference should take into account filters _inside_ sources #21908

Open ggevay opened 11 months ago

ggevay commented 11 months ago

The following query has a nullable output type, even though the select x ... where x is not null should make it easy to see that the result cannot be null:

materialize=> create table t(a int, b int);
CREATE TABLE
materialize=> explain with(types)
select x
from (select a+b as x from t)
where x is not null;
                                        Optimized Plan                                         
-----------------------------------------------------------------------------------------------
 Explained Query:                                                                             +
   Project (#2) // { types: "(integer?)" }                                                    +
     Filter (#0) IS NOT NULL AND (#1) IS NOT NULL // { types: "(integer, integer, integer?)" }+
       Map ((#0 + #1)) // { types: "(integer?, integer?, integer?)" }                         +
         ReadStorage materialize.public.t // { types: "(integer?, integer?)" }                +
                                                                                              +
 Source materialize.public.t                                                                  +
   filter=((#0) IS NOT NULL AND (#1) IS NOT NULL)                                             +

(1 row)

Note that at the beginning of the MIR pipeline we still know that the output won't be null:

materialize=> explain decorrelated plan with(types) for
select x
from (select a+b as x from t)
where x is not null;
                           Decorrelated Plan                           
-----------------------------------------------------------------------
 Filter (#0) IS NOT NULL // { types: "(integer)" }                    +
   Project (#2) // { types: "(integer?)" }                            +
     Map ((#0 + #1)) // { types: "(integer?, integer?, integer?)" }   +
       CrossJoin // { types: "(integer?, integer?)" }                 +
         Constant // { types: "()" }                                  +
           - ()                                                       +
         Get materialize.public.t // { types: "(integer?, integer?)" }+

(1 row)

There are various ways to fix this; the simplest would be to teach nullability inference to take into account the filters that were pushed into the source.

Edit: @aalexandrov points out that, for this, we would also need to tighten the contract of the source so that it always executes the filter.
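
A minimal sketch of what this could look like, using toy types (ColumnType, Predicate, and apply_source_filters here are illustrative stand-ins, not Materialize's actual API, and it assumes the tightened source contract above):

#[derive(Clone, Copy, Debug, PartialEq)]
struct ColumnType {
    nullable: bool,
}

// Real predicates are arbitrary scalar expressions; we only care about one shape here.
#[allow(dead_code)]
enum Predicate {
    IsNotNull { column: usize },
    Other,
}

/// Tighten the column types of a source by the filters pushed into it: a column
/// referenced by an IS NOT NULL predicate cannot be null in the source's output.
fn apply_source_filters(mut types: Vec<ColumnType>, filters: &[Predicate]) -> Vec<ColumnType> {
    for predicate in filters {
        if let Predicate::IsNotNull { column } = predicate {
            types[*column].nullable = false;
        }
    }
    types
}

fn main() {
    // materialize.public.t has types "(integer?, integer?)" ...
    let raw = vec![ColumnType { nullable: true }, ColumnType { nullable: true }];
    // ... but with filter=((#0) IS NOT NULL AND (#1) IS NOT NULL) pushed into
    // the source, both columns are known to be non-null.
    let filters = [
        Predicate::IsNotNull { column: 0 },
        Predicate::IsNotNull { column: 1 },
    ];
    let tightened = apply_source_filters(raw, &filters);
    assert!(tightened.iter().all(|t| !t.nullable)); // "(integer, integer)"
}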

A more general fix would be to propagate the consequences of the Filter (#0) IS NOT NULL AND (#1) IS NOT NULL down into the Map ((#0 + #1)), but this is hard because our nullability inference is a bottom-up recursion. (At least the one in typ is; we have several others in ColumnKnowledge, NonNullable, NonNullRequirements, and possibly other places.)
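
To illustrate why this is hard, here is a toy version of a typ-style bottom-up recursion (illustrative only, not the actual typ code): types are computed leaf-to-root, so by the time the Filter tightens #0 and #1, the type of the mapped column #2 = #0 + #1 has already been fixed.

#[derive(Clone, Copy, Debug)]
struct ColumnType {
    nullable: bool,
}

enum Expr {
    /// A leaf with known column types (stands in for Get/ReadStorage).
    Get { types: Vec<ColumnType> },
    /// Map a new column `#left + #right` onto the input.
    MapAdd { input: Box<Expr>, left: usize, right: usize },
    /// Filter on `#column IS NOT NULL` for each listed column.
    FilterNotNull { input: Box<Expr>, columns: Vec<usize> },
}

fn typ(expr: &Expr) -> Vec<ColumnType> {
    match expr {
        Expr::Get { types } => types.clone(),
        Expr::MapAdd { input, left, right } => {
            let mut types = typ(input);
            // `a + b` is nullable iff either input may be null. This is decided
            // here, with no knowledge of filters above us in the tree.
            let nullable = types[*left].nullable || types[*right].nullable;
            types.push(ColumnType { nullable });
            types
        }
        Expr::FilterNotNull { input, columns } => {
            let mut types = typ(input);
            // The filter tightens only the columns its predicates mention; it
            // cannot revisit the already-computed type of the mapped column.
            for c in columns {
                types[*c].nullable = false;
            }
            types
        }
    }
}

fn main() {
    // Filter (#0) IS NOT NULL AND (#1) IS NOT NULL
    //   Map ((#0 + #1))
    //     ReadStorage materialize.public.t
    let nullable_int = ColumnType { nullable: true };
    let plan = Expr::FilterNotNull {
        input: Box::new(Expr::MapAdd {
            input: Box::new(Expr::Get { types: vec![nullable_int, nullable_int] }),
            left: 0,
            right: 1,
        }),
        columns: vec![0, 1],
    };
    // Prints nullable = false, false, true, i.e. "(integer, integer, integer?)":
    // #2 stays nullable even though its inputs are now known to be non-null.
    println!("{:?}", typ(&plan));
}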

aalexandrov commented 11 months ago

It's really weird that the "optimal" plan structure changes between optimize/local, where we have

Project (#2) // { types: "(integer)" }
  Map ((#0 + #1)) // { types: "(integer, integer, integer)" }
    Filter (#0) IS NOT NULL AND (#1) IS NOT NULL // { types: "(integer, integer)" }
      Get materialize.public.t // { types: "(integer?, integer?)" }

and optimize/global, where we have

Explained Query:
  Project (#2) // { types: "(integer?)" }
    Filter (#0) IS NOT NULL AND (#1) IS NOT NULL // { types: "(integer, integer, integer?)" }
      Map ((#0 + #1)) // { types: "(integer?, integer?, integer?)" }
        ReadStorage materialize.public.t // { types: "(integer?, integer?)" }

Source materialize.public.t
  filter=((#0) IS NOT NULL AND (#1) IS NOT NULL)

At first glance I would say that the relative order of adjacent Filter and Map should not be changed after optimize/global.

aalexandrov commented 11 months ago

The Filter is lifted as part of optimize/global/logical_cleanup/fixpoint/0000/canonicalize_mfp, so in this case my take is that logical_cleanup violates the

Projections are pushed down as far as possible.

constraint of its input. I suggest adding a predicate_pushdown at the end of this phase to fix this.
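
To illustrate, reusing the toy IR from the typ sketch above: if a predicate_pushdown at the end of this phase moves the IS NOT NULL predicates back below the Map, the same bottom-up recursion recovers a non-null result column.

fn pushed_down_plan() -> Expr {
    // Map ((#0 + #1))
    //   Filter (#0) IS NOT NULL AND (#1) IS NOT NULL
    //     ReadStorage materialize.public.t
    let nullable_int = ColumnType { nullable: true };
    Expr::MapAdd {
        input: Box::new(Expr::FilterNotNull {
            input: Box::new(Expr::Get { types: vec![nullable_int, nullable_int] }),
            columns: vec![0, 1],
        }),
        left: 0,
        right: 1,
    }
}

// typ(&pushed_down_plan()) yields "(integer, integer, integer)": with the
// filter beneath it, the Map sees non-null inputs, so the mapped column is non-null.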

ggevay commented 11 months ago

It's changed by CanonicalizeMfp, which is called from both physical_optimizer and logical_cleanup_pass; the latter is called from optimize_dataflow.