eto-ai / rikai

Parquet-based ML data format optimized for working with unstructured data
https://rikai.readthedocs.io/en/latest/
Apache License 2.0
137 stars, 19 forks

Python Support for Spark 3.2.0 #531

Closed Renkai closed 2 years ago

Renkai commented 2 years ago

This patch closes https://github.com/eto-ai/rikai/issues/421

PySpark only supports Scala 2.12, so this patch checks the Spark version to decide which macros to define, instead of assuming that Scala 2.12 always means Spark 3.1.x and Scala 2.13 always means Spark 3.2.x.
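
To illustrate the idea, here is a minimal sketch (not the actual build code in this patch) of selecting version-specific sources from the Spark version alone, so that a Scala 2.12 build can target either Spark line. The function name `spark_compat_dir` and the directory names are hypothetical and only serve the example.

```python
def spark_compat_dir(spark_version: str) -> str:
    """Pick a version-specific source directory from the Spark version.

    The Scala version is deliberately not an input: Scala 2.12 may be
    paired with Spark 3.1.x or Spark 3.2.x.
    """
    # Compare only major.minor, e.g. "3.2.0" -> (3, 2).
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    if (major, minor) >= (3, 2):
        return "src/main/spark-3.2"  # hypothetical directory name
    return "src/main/spark-3.1"      # hypothetical directory name


# The same Scala version can map to different Spark-specific sources.
for scala, spark in [("2.12", "3.1.2"), ("2.12", "3.2.0"), ("2.13", "3.2.0")]:
    print(f"Scala {scala} + Spark {spark} -> {spark_compat_dir(spark)}")
```

The point of the sketch is that the decision depends only on the Spark version, which is what lets PySpark users on Scala 2.12 pick up the Spark 3.2 code path.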

Renkai commented 2 years ago

This patch adds Spark 3.2 to CI, but it does not release a public version that supports Spark 3.2. We can change the release script later, once people actually need rikai for Spark 3.2.