Open maccam912 opened 3 years ago
Thank you for filing this request. Spark does provide a method for materializing computation. As long as Vaex provides a similar one, there shouldn't be any issues because of laziness. Note that the author of Vaex, @maartenbreddels, was already reproducing this benchmark and was planning to make a PR, as mentioned in #150.
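To illustrate the laziness point, here is a minimal sketch using only the Python standard library. `LazyGroupBySum`, `collect`, and `timed` are hypothetical names made up for this example; they are not Vaex or Spark API. The idea is only that a timing is meaningful when the measured region forces the result, not when it merely builds the query.

```python
import time
from collections import defaultdict


class LazyGroupBySum:
    """Hypothetical stand-in for a lazy engine: constructing it does no work."""

    def __init__(self, rows):
        self._rows = rows
        self.evaluated = False

    def collect(self):
        """Force the computation and return a concrete dict of per-group sums."""
        sums = defaultdict(int)
        for key, value in self._rows:
            sums[key] += value
        self.evaluated = True
        return dict(sums)


def timed(fn):
    """Run fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


rows = [(i % 3, i) for i in range(9)]

# Building the query is cheap and says nothing about group-by cost.
query, build_time = timed(lambda: LazyGroupBySum(rows))

# The benchmark-relevant number is the time to materialize the result.
result, run_time = timed(query.collect)
```

In a real benchmark the `collect` step would be replaced by whatever materialization call the library under test actually provides.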
@jangorecki @maartenbreddels Looks like #150 is closed. Was there ever a PR to include Vaex in the benchmarks? I don't see it in the latest generated report.
no
This is planned 🙂
Hi, I'm looking at this and at the linked pull request, and the reason why the vaex comparison still isn't included is a bit over my head. Is that comparison available somewhere else? Or a similar one that does include vaex?
Just found this, thanks for this repo! I see others have offered new projects that could be included in the comparison. Might I suggest Vaex as a future contender?
https://vaex.readthedocs.io/en/latest/
It's lazy, so I'm not sure what magic needs to be done to get actual times for group-by and join. Presumably whatever Spark is doing, now that I think about it.
I will now attempt to figure out how to add a tag to this issue.