rajasekarv / vega

A new arguably faster implementation of Apache Spark from scratch in Rust
Apache License 2.0
2.23k stars, 206 forks

Add union_rdd #46

Closed (iduartgomez closed 4 years ago)

iduartgomez commented 4 years ago

Preparing the PR to add union_rdd.

I still need to fix the incorrect dependency graph and make a couple more polishing changes once that is fixed (I'm not really happy with how I am exposing the two variants publicly), but the fixes are required before proceeding with the polishing.

iduartgomez commented 4 years ago

@rajasekarv this is ready to be merged now. If you want to take a last look before the merge, I will leave it open for a bit.

iduartgomez commented 4 years ago

Fixes #41

rajasekarv commented 4 years ago

Great work @iduartgomez. Sorry for the delay on my side. Let me quickly review it, and if there is nothing else to add, I will merge this.

iduartgomez commented 4 years ago

The current situation is that we have found a deadlock (probably in the scheduler). Raja was able to trace it back to older commits, even with all union tests removed (#8026dd5). I am able to reproduce it after #f5bbc8c.

Master works fine, so we can use it as a baseline to try to pinpoint the current problem.