Closed gregberns closed 2 years ago
Materialize uses DD exclusively for its data processing, so tbh I'd recommend looking at the plans it generates. You can inspect these from the dataflow visualizer (the "hierarchical memory visualizer" at localhost:6875, or localhost:6876 depending on which version you have).
In your example, you would want to avoid the cross join (generally you want to avoid them) and instead follow the chain of equality constraints. Probably you would filter a1, equijoin with a2, equijoin with b2, then equijoin with b1. No cross join required. MZ will handle all of this reasoning for you, and you can even use EXPLAIN <sql query> for it to show you the join plan it would implement in DD.
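To make the "follow the chain of equality constraints" idea concrete, here is a minimal sketch in plain Rust (standard library only, not the DD API). The original queries are not shown in this thread, so the table shapes, column names, and the "keep" predicate are all hypothetical. Each `equijoin` call plays the role that a keyed `join` would play in Differential, and the `map` between joins re-keys the result by the next column in the chain.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Equijoin two keyed relations: for every pair of rows sharing a key,
/// emit (key, (left_value, right_value)).
fn equijoin<K, L, R>(left: &[(K, L)], right: &[(K, R)]) -> Vec<(K, (L, R))>
where
    K: Hash + Eq + Clone,
    L: Clone,
    R: Clone,
{
    // Index the right side by key so a left row only meets matching rows.
    let mut index: HashMap<&K, Vec<&R>> = HashMap::new();
    for (k, r) in right {
        index.entry(k).or_default().push(r);
    }
    let mut out = Vec::new();
    for (k, l) in left {
        if let Some(matches) = index.get(k) {
            for r in matches {
                out.push((k.clone(), (l.clone(), (*r).clone())));
            }
        }
    }
    out
}

/// Hypothetical schema (an assumption, not taken from the thread):
/// a1: (a1_id, status), a2: (a1_id, link), b2: (link, b1_id), b1: (b1_id, payload).
fn chained_plan(
    a1: &[(u32, &'static str)],
    a2: &[(u32, u32)],
    b2: &[(u32, u32)],
    b1: &[(u32, &'static str)],
) -> Vec<(u32, &'static str)> {
    // 1. Filter a1 first so later joins see as few rows as possible.
    let filtered: Vec<(u32, &'static str)> = a1
        .iter()
        .copied()
        .filter(|(_, status)| *status == "keep")
        .collect();
    // 2. Equijoin with a2 on a1_id, then re-key by link.
    let by_link: Vec<(u32, u32)> = equijoin(&filtered, a2)
        .into_iter()
        .map(|(a1_id, (_status, link))| (link, a1_id))
        .collect();
    // 3. Equijoin with b2 on link, then re-key by b1_id.
    let by_b1: Vec<(u32, u32)> = equijoin(&by_link, b2)
        .into_iter()
        .map(|(_link, (a1_id, b1_id))| (b1_id, a1_id))
        .collect();
    // 4. Equijoin with b1 on b1_id and keep (a1_id, payload).
    equijoin(&by_b1, b1)
        .into_iter()
        .map(|(_b1_id, (a1_id, payload))| (a1_id, payload))
        .collect()
}

fn main() {
    let a1 = vec![(1, "keep"), (2, "drop")];
    let a2 = vec![(1, 10), (2, 20)];
    let b2 = vec![(10, 100), (20, 200), (30, 300)];
    let b1 = vec![(100, "x"), (200, "y"), (300, "z")];
    println!("{:?}", chained_plan(&a1, &a2, &b2, &b1));
}
```

At no point does the plan pair every A row with every B row: each step only combines rows that already agree on a key, which is why the cross join is not needed.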
Your second query is what Materialize calls a "temporal filter". It works out great there; again I'd check the query plans!
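Since the second query isn't shown here, the following is only a rough sketch of the general idea behind a temporal filter, in plain Rust rather than DD: a row that should be visible during a window [start, end) can be modeled as an insertion at `start` and a retraction at `end`. The event names, the window bounds, and the function names `window_to_updates` and `visible_at` are made up for illustration.

```rust
use std::collections::BTreeMap;

/// (data, time, diff): a differential-style update triple.
type Update = (&'static str, u64, i64);

/// Turn each (data, start, end) window into an insertion at `start`
/// and a retraction at `end`. Empty windows produce no updates.
fn window_to_updates(events: &[(&'static str, u64, u64)]) -> Vec<Update> {
    let mut updates = Vec::new();
    for &(data, start, end) in events {
        if start < end {
            updates.push((data, start, 1));
            updates.push((data, end, -1));
        }
    }
    updates
}

/// Accumulate all updates at or before `now` and return the rows whose
/// net count is positive, in sorted order.
fn visible_at(updates: &[Update], now: u64) -> Vec<&'static str> {
    let mut counts: BTreeMap<&'static str, i64> = BTreeMap::new();
    for &(data, time, diff) in updates {
        if time <= now {
            *counts.entry(data).or_insert(0) += diff;
        }
    }
    counts
        .into_iter()
        .filter(|(_, count)| *count > 0)
        .map(|(data, _)| data)
        .collect()
}

fn main() {
    // Hypothetical events with validity windows [start, end).
    let updates = window_to_updates(&[("login", 5, 10), ("click", 8, 12)]);
    for now in [4, 9, 10, 12] {
        println!("t={now}: {:?}", visible_at(&updates, now));
    }
}
```

The point of the sketch is that the "filter on time" never rescans old data: each row contributes exactly two updates, and the output at any time is just the accumulation of the updates so far.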
Awesome, forgot I could flip the joins to get rid of the cross join - worked perfectly.
Thx for the tip to look at the Materialize Explain. After digging through this with DD and the complexity of the queries I need to replicate, sounds like I'm going to be playing with Materialize soon!
Just started playing with differential again, and trying to understand how to do similar things as Materialize - join relational data in potentially complex ways. I've got some existing queries that I'm using as a 'platform' to play with the lib - and the queries may not be 'good', so trying to understand how this might be done with "Dataflow Thinking".
This is a two part question, but it feels like there's a 'unified solution':
CROSS JOIN to tie two different data sets together, where B is constrained by the values returned from A?
Query 1:
Query 2:
In 'normal' programming, it seems like the values from the A tables could be gathered in Query 1, then passed as parameters into the CROSS JOIN part of the query - and then the tableA* stuff can be removed.
How would this be approached in Differential?
I ran into scope.region which seems like it might be helpful - but haven't wrapped my head around it yet.