Closed cxzl25 closed 5 months ago
@gszadovszky @Fokko
We see build errors for 1.14.0-SNAPSHOT against Apache Spark branch-3.5:
[warn] multiple main classes detected: run 'show discoveredMainClasses' to see the list
[error] java.lang.RuntimeException: found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[error]
[error] * com.github.luben:zstd-jni:1.5.6-2 (strict) is selected over {1.5.5-4}
[error] +- org.apache.parquet:parquet-hadoop:1.14.0-SNAPSHOT (depends on 1.5.6-2)
[error] +- org.apache.spark:spark-core_2.12:3.5.2-SNAPSHOT (depends on 1.5.5-4)
[error]
I'm not sure if this is a blocking issue for the release.
FYI: zstd-jni versions on different branches of Apache Spark as of 2024-04-29:
master: 1.5.6-3
branch-3.5: 1.5.5-4
branch-3.4: 1.5.2-5
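To make the mismatch concrete, here is a minimal Python sketch that orders the zstd-jni versions quoted above and lists the Spark branches that are behind what parquet-hadoop 1.14.0-SNAPSHOT depends on. The function names are hypothetical, and comparing versions as integer tuples is an assumption for illustration; it does not model the eviction check that sbt actually performs.

```python
import re


def parse_zstd_jni_version(version: str) -> tuple[int, ...]:
    """Split a zstd-jni version such as '1.5.6-2' into (1, 5, 6, 2)."""
    return tuple(int(part) for part in re.split(r"[.-]", version))


# Version that parquet-hadoop 1.14.0-SNAPSHOT depends on, per the build log.
PARQUET_REQUIRES = "1.5.6-2"

# zstd-jni versions per Apache Spark branch as of 2024-04-29, per the thread.
SPARK_BRANCHES = {
    "master": "1.5.6-3",
    "branch-3.5": "1.5.5-4",
    "branch-3.4": "1.5.2-5",
}


def branches_behind(required: str, branches: dict[str, str]) -> list[str]:
    """Return the branches whose zstd-jni version is older than `required`."""
    needed = parse_zstd_jni_version(required)
    return [
        name
        for name, version in branches.items()
        if parse_zstd_jni_version(version) < needed
    ]


if __name__ == "__main__":
    # Assumption: "older" means numerically smaller, component by component.
    print(branches_behind(PARQUET_REQUIRES, SPARK_BRANCHES))
```

Under that assumption, only master already ships a zstd-jni at or above 1.5.6-2, which matches the conflict reported for branch-3.5.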
@wgtmac Thanks for raising this. Looking at the history, there seem to be some important patches in there: https://github.com/luben/zstd-jni/commits/master/?before=c77a7658aeba94ccd3da52e61ce87d06e7292826+35
We can't backport this to the 3.5 branch anyway. So, we're probably good with keeping this in line with the Spark main branch. WDYT?
So we just need to pair 1.14.0 with Apache Spark 4.0.0? I'm not sure if this is a good idea. It would be better if Apache Spark 3.5.2 could adopt Parquet 1.14.0.
I think they don't allow backporting dependency upgrades in Spark unless there are CVEs :(
I'm not sure if that is the case. For Apache ORC, we use the following mapping to pair it with Apache Spark and always update the latest minor version to Apache Spark:
Apr 4, 2024 https://github.com/luben/zstd-jni/releases/tag/v1.5.6-2
Mar 28, 2024 https://github.com/luben/zstd-jni/releases/tag/v1.5.6-1
Dec 1, 2023 https://github.com/luben/zstd-jni/releases/tag/v1.5.5-11