This gets all of the Spark planner working, which is to say, we can convert a TypedPipe into a spark_backend.Op.
I have the rest in a follow-up PR: the Spark Writer, which manages Execution rendering.
This leverages #1867, which was very useful for implementing the join and grouping operations without materializing values into memory (at the cost of sorting instead).
This could later be a Config option that controls whether join or mapGroup uses the sort-based or the memory-based approach, but I chose to be conservative here and use the more scalable, lower-memory version (sorting).
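
To illustrate the tradeoff, here is a minimal sketch in plain Scala collections. It is not the code in this PR or in #1867; `GroupingSketch`, `memoryGroup`, and `sortedFold` are hypothetical names, and the in-memory `sortBy` in the usage note only stands in for the sort the cluster would do.

```scala
object GroupingSketch {
  // Memory-based: every value for every key is held in one Map at once.
  def memoryGroup[K, V](items: Iterator[(K, V)]): Map[K, List[V]] =
    items
      .foldLeft(Map.empty[K, List[V]]) { case (acc, (k, v)) =>
        acc.updated(k, v :: acc.getOrElse(k, Nil))
      }
      .map { case (k, vs) => (k, vs.reverse) } // restore input order per key

  // Sort-based: assumes the input is already sorted by key, so each key's
  // values are adjacent. Only one accumulator is live at any time.
  def sortedFold[K, V, B](sorted: Iterator[(K, V)])(init: B)(fn: (B, V) => B): Iterator[(K, B)] = {
    val buf = sorted.buffered
    new Iterator[(K, B)] {
      def hasNext: Boolean = buf.hasNext
      def next(): (K, B) = {
        val key = buf.head._1
        var acc = init
        // Consume the run of consecutive records that share this key.
        while (buf.hasNext && buf.head._1 == key) {
          acc = fn(acc, buf.next()._2)
        }
        (key, acc)
      }
    }
  }
}
```

For example, with `val data = List(("b", 2), ("a", 1), ("b", 3))`, calling `GroupingSketch.sortedFold(data.sortBy(_._1).iterator)(0)(_ + _).toList` gives `List(("a", 1), ("b", 5))`, while `GroupingSketch.memoryGroup(data.iterator)` builds the whole `Map("b" -> List(2, 3), "a" -> List(1))` before anything can be consumed.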
cc @ianoc @non