Closed wangrunji0408 closed 5 months ago
When querying from a view, build executors for all views and then build other plan nodes on top of them. Given that a view can be consumed by multiple downstream nodes, we introduce StreamSubscriber to allow multiple consumers of a stream.
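To make the fan-out idea concrete, here is a minimal sketch of one upstream feeding several consumers. It is an illustration only, not the PR's actual `StreamSubscriber`: the `FanOut` type, its methods, and the `demo` function are hypothetical names, and it uses synchronous `std::sync::mpsc` channels, whereas the real executors may well be asynchronous streams.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical fan-out helper (an assumption for illustration, not the
/// PR's `StreamSubscriber`): forwards every upstream item to all consumers.
struct FanOut<T: Clone> {
    subscribers: Vec<Sender<T>>,
}

impl<T: Clone> FanOut<T> {
    fn new() -> Self {
        FanOut { subscribers: Vec::new() }
    }

    /// Register a downstream consumer and hand back its receiving end.
    fn subscribe(&mut self) -> Receiver<T> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Drive the upstream exactly once, cloning each item per consumer.
    /// Send errors (a consumer that was dropped) are ignored.
    fn run(self, upstream: impl Iterator<Item = T>) {
        for item in upstream {
            for tx in &self.subscribers {
                let _ = tx.send(item.clone());
            }
        }
        // `self` is dropped here, which closes every channel so that
        // the receivers' iterators terminate.
    }
}

/// Two consumers read the same "view" output; the upstream runs once.
fn demo() -> (Vec<i32>, i32) {
    let mut view = FanOut::<i32>::new();
    let rx_a = view.subscribe();
    let rx_b = view.subscribe();
    view.run(vec![1, 2, 3].into_iter());
    let all: Vec<i32> = rx_a.iter().collect();
    let sum: i32 = rx_b.iter().sum();
    (all, sum)
}
```

In this sketch both receivers observe the full output while the upstream iterator is consumed only once, which is the property that lets a single view executor serve multiple downstream plan nodes.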
Well, then we have a DAG in the system 🤪 I thought an easier way would be to just create multiple copies of the plan and execute it multiple times.
I think it is natural to have a DAG in data processing pipelines. It'd be better to reuse results from common upstreams. Another interesting fact is that a query plan in egg's e-graph is also a DAG, even if you don't construct a DAG intentionally, because egg automatically identifies and merges equal nodes. This will make it easier to eliminate common subexpressions and even reuse CTEs in the future.
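As a rough illustration of how merging equal nodes turns a tree into a DAG, here is a small hash-consing sketch. It is a simplification under stated assumptions and not egg's implementation or API: the `Node` enum, the `Arena` type, and `build_self_join` are hypothetical names, and a real e-graph also tracks equivalence classes, which this sketch leaves out.

```rust
use std::collections::HashMap;

/// A plan node whose children are ids of previously added nodes.
/// (Hypothetical operators, chosen only for the example.)
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum Node {
    Scan(String),
    Filter(usize),
    Join(usize, usize),
}

/// Hypothetical hash-consing arena: structurally equal nodes share one id.
#[derive(Default)]
struct Arena {
    nodes: Vec<Node>,
    ids: HashMap<Node, usize>,
}

impl Arena {
    /// Return the id of an existing equal node, or insert a new one.
    fn add(&mut self, node: Node) -> usize {
        if let Some(&id) = self.ids.get(&node) {
            return id;
        }
        let id = self.nodes.len();
        self.nodes.push(node.clone());
        self.ids.insert(node, id);
        id
    }
}

/// Build `Join(Filter(Scan v), Filter(Scan v))` branch by branch, the way
/// a tree builder would, and report the two branch ids and the node count.
fn build_self_join() -> (usize, usize, usize) {
    let mut arena = Arena::default();
    let s1 = arena.add(Node::Scan("v".to_string()));
    let left = arena.add(Node::Filter(s1));
    // The right branch is constructed independently of the left one.
    let s2 = arena.add(Node::Scan("v".to_string()));
    let right = arena.add(Node::Filter(s2));
    arena.add(Node::Join(left, right));
    (left, right, arena.nodes.len())
}
```

Although the two branches are built separately, they end up with the same id, so the join points at a single shared subplan and only three distinct nodes are stored. That sharing is what makes common-subexpression reuse fall out naturally.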
This PR is part of #796. It adds support for creating, querying, and dropping views in memory.
The key implementations are:
- StreamSubscriber, to allow multiple consumers of a stream.

Limitations: