aalexandrov opened 11 months ago
As part of this work, we may additionally want to annotate the `TopK` node with whether it implements a `DISTINCT ON` or a `LIMIT`. This would allow us to have `EXPLAIN` output that looks like the following:
```sql
CREATE TABLE teachers (id INT, name TEXT);
CREATE TABLE sections (id INT, teacher_id INT, course_id INT, schedule TEXT);

-- Illustrates that new hints apply without ambiguity in a single query block.
CREATE MATERIALIZED VIEW distinct_on_group_by_limit AS
SELECT DISTINCT ON(teacher_id) id, teacher_id, MAX(course_id)
FROM sections
GROUP BY id, teacher_id
OPTIONS (
  AGGREGATE INPUT GROUP SIZE = 1000,
  DISTINCT ON INPUT GROUP SIZE = 60,
  LIMIT INPUT GROUP SIZE = 50)
ORDER BY teacher_id, id
LIMIT 2;

EXPLAIN PLAN FOR MATERIALIZED VIEW distinct_on_group_by_limit;
-- With lineage, we could improve the EXPLAIN output as follows:
```
```
CREATE TABLE
CREATE TABLE
CREATE MATERIALIZED VIEW
                                   Optimized Plan
-------------------------------------------------------------------------------------
 materialize.public.distinct_on_group_by_limit:
   TopK order_by=[#1 asc nulls_last, #0 asc nulls_last] limit=2 input_group_size=50
     TopK group_by=[#1] order_by=[#0 asc nulls_last] distinct on input_group_size=60
       Reduce group_by=[#0, #1] aggregates=[max(#2)] input_group_size=1000
         Project (#0..=#2)
           ReadStorage materialize.public.sections
(1 row)
```
@vmarcos isn't `distinct on` just the same as `limit=1` in the presence of `group_by`?
@aalexandrov It's not a bidirectional mapping, unfortunately. Consider the following example:
```sql
CREATE TABLE teachers (id INT, name TEXT);
CREATE TABLE sections (id INT, teacher_id INT, course_id INT, schedule TEXT);

CREATE MATERIALIZED VIEW alt_section_of_top_course_per_teacher AS
SELECT id AS teacher_id,
       (SELECT id AS section_id
        FROM sections
        WHERE teacher_id = teachers.id
        OPTIONS (LIMIT INPUT GROUP SIZE = 1000)
        ORDER BY course_id DESC
        LIMIT 1)
FROM teachers;

EXPLAIN PLAN FOR MATERIALIZED VIEW alt_section_of_top_course_per_teacher;
```
```
                                    Optimized Plan
---------------------------------------------------------------------------------------
 materialize.public.alt_section_of_top_course_per_teacher:
   Return
     Project (#0, #2)
       Join on=(#0 = #1) type=differential
         ArrangeBy keys=[[#0]]
           Get l0
         ArrangeBy keys=[[#0]]
           Union
             Get l2
             Map (null)
               Union
                 Negate
                   Project (#0)
                     Get l2
                 Get l1
   With
     cte l2 =
       Project (#0, #1)
         TopK group_by=[#0] order_by=[#2 desc nulls_first] limit=1 exp_group_size=1000
           Project (#0, #1, #3)
             Filter (#0) IS NOT NULL
               Join on=(#0 = #2) type=differential
                 ArrangeBy keys=[[#0]]
                   Get l1
                 ArrangeBy keys=[[#1]]
                   Project (#0..=#2)
                     Filter (#1) IS NOT NULL
                       ReadStorage materialize.public.sections
     cte l1 =
       Distinct group_by=[#0]
         Get l0
     cte l0 =
       Project (#0)
         ReadStorage materialize.public.teachers

 Source materialize.public.sections
   filter=((#1) IS NOT NULL)

(1 row)
```
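To make the exchange above concrete: the forward direction of the mapping does hold, since `DISTINCT ON` lowers to a per-group `TopK` with `limit=1`; it is the reverse direction that fails, as the example shows. A minimal sketch of `TopK` semantics (Python; all names here are illustrative, not Materialize's actual implementation):

```python
from itertools import groupby

def topk(rows, group_key, order_key, limit):
    """Minimal sketch of TopK semantics: after ordering, keep the
    first `limit` rows of each group."""
    ordered = sorted(rows, key=lambda r: (group_key(r), order_key(r)))
    out = []
    for _, group in groupby(ordered, key=group_key):
        out.extend(list(group)[:limit])
    return out

# sections as (id, teacher_id, course_id)
sections = [(1, 10, 100), (2, 10, 200), (3, 20, 150)]

# SELECT DISTINCT ON (teacher_id) ... ORDER BY teacher_id, course_id DESC
# behaves like TopK with group_by=teacher_id, limit=1:
top = topk(sections,
           group_key=lambda r: r[1],
           order_key=lambda r: -r[2],
           limit=1)
# one row per teacher: the section with the highest course_id
```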
Briefly discussed this epic with @teskje. We believe it requires a clearer product story:
Other than that, we think it'd be a nice addition to Mz to surface size information more widely.
Per Q2 Roadmap (i.e. planning) discussions at the on-site:
est: ~~4~~ 5 weeks
Aargh. The estimate was 5 weeks, had a typo.
> Querying the introspection data as part of an explain might be surprising, specifically because we'd need to construct a dataflow to collect the data.
This would have complicated things, but fortunately in the meantime Unified Introspection will solve this: we'll just need to read from Persist. See 2. above.
Compute team agrees there are still unknowns to iron out
Some design questions off the top of my head:
> ***31 GB***?

> Memory reporting will be implemented at the level of Physical Plan nodes

We are planning to show it on the MIR plan (the "optimized plan", which the default `EXPLAIN` shows), because that is much more readable and easier for the user to map back to SQL.

> Part of the implementation will require moving IDs from the Optimized Plan's nodes to the Physical Plan's nodes

It's actually the other way around: we'll have to map back from LIR node IDs to MIR node IDs. Also, it turns out that we don't yet have a mapping even from rendered node IDs to LIR, so we need to add that here as well.
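A sketch of that two-step roll-up (Python; all ids, names, and numbers are made up for illustration): per-operator metrics are keyed by rendered operator ids, so attributing them to MIR nodes means composing the rendered ⇒ LIR and LIR ⇒ MIR mappings:

```python
def rollup_to_mir(per_operator_bytes, rendered_to_lir, lir_to_mir):
    """Sum per-rendered-operator sizes up to the MIR node they lower from.
    All ids and names are hypothetical, for illustration only."""
    per_mir = {}
    for op_id, size in per_operator_bytes.items():
        mir_id = lir_to_mir[rendered_to_lir[op_id]]
        per_mir[mir_id] = per_mir.get(mir_id, 0) + size
    return per_mir

# Two rendered operators (101, 102) lower from LIR node 11; one (103) from 12.
rendered_to_lir = {101: 11, 102: 11, 103: 12}
lir_to_mir = {11: 1, 12: 2}
sizes = {101: 512, 102: 256, 103: 1024}

per_mir = rollup_to_mir(sizes, rendered_to_lir, lir_to_mir)
# MIR node 1 accounts for 768 bytes, MIR node 2 for 1024
```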
I've renamed this to "Plan Profiling", because we want to show not just memory usage, but other things, e.g., CPU usage.
Discussion on why we rely on Unified Introspection for this: https://materializeinc.slack.com/archives/C074FURKP8W/p1718293082913619?thread_ts=1718127372.822979&cid=C074FURKP8W
https://github.com/MaterializeInc/materialize/pull/27316 is a draft PR from Alex implementing a part of this.
Product outcome
Currently, internal arrangement sizes are reported only in relation to the rendered dataflow graph that creates them. Consequently, it is hard to figure out which parts of a SQL query are the main contributors to excessive memory consumption. The current approach is to infer this manually by looking at the rendered dataflow graph. This is time-consuming, error-prone, and in general impossible for people who are not well-versed in our rendering code. This approach does not scale, and at the moment it is effectively limited to the members of the @MaterializeInc/compute team.
Automatically relating internal arrangements to SQL-level syntax fragments is hard, but relating them to the `MirRelationExpr` (sub-)terms produced in the `OPTIMIZED PLAN` stage is a tractable problem. As an outcome of this epic, Materialize customers as well as field engineering and engineering ambassadors (who are the more likely first consumers of this feature) should be able to run `EXPLAIN` commands on existing materialized view and index items. The output will show the cumulative memory consumption (reported as the sum of the arrangement sizes) associated with each `MirRelationExpr` in the `OPTIMIZED PLAN` for the explained item.

Edit: Also, we'd like to show not just memory usage, but CPU usage, skewness, hydration state, input size in rows or bytes, etc. Basically, all the things that are shown in introspection relations at the level of dataflow operators, we'd like to map back to the MIR plan.
Discovery
To achieve the outcome of the epic we need to solve the following sub-problems:
1. Relate the sub-terms of the `OPTIMIZED PLAN` to the LIR expressions in the MIR ⇒ LIR lowering step.
2. Expose this lineage in `mz_internal` tables in a way which allows us to attach this information to the `mz_arrangement_sizes` data.
3. Provide an `Attribute` that can be requested with an `ExplainConfig` flag.

Some of the sub-problems have different viable solutions, and we will need to motivate our choice (probably in a design doc).
Work items
Now

Related release blockers.

Related `EXPLAIN` output corners that need to be polished before we can ship the feature.

Next

Related feature requests that will become easier to ship by piggy-backing on implementation work done for this feature.
Decision log