joan38 closed this pull request 1 year ago
Hey @joan38 - the changes look good. They are around the same lines of code where I made some changes in my PR, but I think the two sets of changes should be independent. Could you also check against Spark v3.5.0 whether that code is still the same? A small unit test that only passes together with your changes would also be great, I think. Cheers!
@michael72 Spark 3.5.0 completely changed how encoders work and introduced new Expressions. If we do what they do, this lib will not work with older Spark 3 versions.
I also fixed more things that were breaking joinWith, and added a test.
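For context, this is roughly the shape of that test - a simplified sketch, not the code as committed, and it assumes the scala3encoders.given import from this library's README:

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
// Derived Encoders for Scala 3 case classes; import path assumed from the README.
import scala3encoders.given

case class Person(id: Int, name: String)
case class Order(personId: Int, amount: Double)

@main def joinWithSmokeTest(): Unit =
  val spark = SparkSession.builder().master("local[*]").appName("joinWith-smoke").getOrCreate()
  try
    val persons = spark.createDataset(Seq(Person(1, "Leia"), Person(2, "Han")))
    val orders  = spark.createDataset(Seq(Order(1, 10.0), Order(2, 20.0)))

    // joinWith keeps both sides as typed objects, so it exercises the tuple
    // encoder's serializer and deserializer - the code path that was breaking.
    val joined: Dataset[(Person, Order)] =
      persons.joinWith(orders, persons("id") === orders("personId"))

    assert(joined.collect().toSet ==
      Set((Person(1, "Leia"), Order(1, 10.0)), (Person(2, "Han"), Order(2, 20.0))))
  finally spark.stop()
```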
@joan38 I don't have a laptop with me right now so it's difficult to check, but I thought that between 3.4.1 and 3.5.0 they mostly moved some code around and it was essentially the same - I could be completely wrong, though. Taking an older version like 3.3.x as a reference is fine, but completely ignoring the newest one feels wrong to me, especially since this library exists to remedy the fact that Spark itself completely ignored the evolution of Scala! From my POV the PR is fine though.
I checked out this PR and rebased it on my Spark version matrix PR. compile and test are successful, but runMain on StarWars fails.
@vincenzobaz the StarWars example wouldn't work with my PR either because of the type checks applied. That example would have to be reworked. Did you copy it from somewhere or was it "handcrafted"? 😉
Halfway copied, halfway handcrafted :laughing: We can change it as much as we want; I don't have strong feelings about it. In fact it is quite ugly.
The newly added thing is this whole AgnosticEncoder: https://github.com/apache/spark/blob/master/sql/api/src/main/scala/org/apache/spark/sql/catalyst/encoders/AgnosticEncoder.scala#L114
If I'm not mistaken, I don't see how we can adopt it without requiring Spark 3.5.
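To make the version coupling concrete, here is a minimal sketch of what depending on that API surface looks like - based on my reading of the 3.5 sources, so the exact signatures may differ slightly:

```scala
// Both live under org.apache.spark.sql.catalyst.encoders, but the
// AgnosticEncoders object is part of the new Spark 3.5 encoder design.
import org.apache.spark.sql.catalyst.encoders.{AgnosticEncoders, ExpressionEncoder}

// A ready-made AgnosticEncoder[String] shipped with Spark 3.5+.
val agnostic = AgnosticEncoders.StringEncoder

// In 3.5 the runtime ExpressionEncoder is built from an AgnosticEncoder, so the
// serializer/deserializer expressions no longer have to be assembled by hand -
// which is exactly why code like this cannot compile against 3.3.x / 3.4.x.
val enc: ExpressionEncoder[String] = ExpressionEncoder(agnostic)
```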
Well, if necessary we can copy it - or just the parts we need - the same as I did (more or less) with the Helper file. A lot of this project copies parts of the Spark project that fit the encoder/decoder logic.
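Purely as an illustration of what "copy just what we need" could look like - the package and trait names below are made up and don't exist in this repo or in Spark:

```scala
package scala3encoders.compat // hypothetical package, for illustration only

import scala.reflect.ClassTag
import org.apache.spark.sql.types.DataType

// A trimmed-down, vendored stand-in for the parts of Spark 3.5's AgnosticEncoder
// that the derivation logic would need, so the library itself keeps compiling
// against Spark 3.3.x and 3.4.x.
trait MiniAgnosticEncoder[T]:
  def dataType: DataType
  def clsTag: ClassTag[T]
  def nullable: Boolean = false
```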
Ok, feel free to copy this PR and modify it as needed to move forward.
On our side we copied the library into our codebase, since joinWith calls are broken in the current state of this library. But we'd be happy to depend on it again once this is fixed.
Hi, did we come to a decision on this? Should we move forward with this version in order to unlock joins?
Superseded by #45
Copied from https://github.com/apache/spark/blob/v3.3.1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L389-L395
Fixes #46