chitralverma / scala-polars

Polars for Scala & Java projects!
https://chitralverma.github.io/scala-polars/
Apache License 2.0

scala-polars

scala-polars is a library for using the awesome Polars DataFrame library in Scala and Java projects.

About

About Polars

Polars is a blazing fast DataFrames library implemented in Rust using Apache Arrow Columnar Format as the memory model.

About scala-polars

This library is written mostly in Scala and uses JNI to offload heavy data processing tasks to its native counterpart, which is written entirely in Rust. The aim of this library is to provide an easy-to-use interface for Scala/Java developers through which they can use the amazing Polars library in their existing projects.

The project is mainly divided into 2 submodules: core, which holds the Scala/Java code, and native, which holds the Rust code.

Examples

Compatibility

Building

Prerequisites

The following tooling is required to start building scala-polars: JDK 8+, sbt, and the latest Rust compiler.

How to Compile?

sbt is the primary build tool for this project, and the modules are wired together so that a build from your IntelliJ IDE and an external build work the same way. Whether you are developing or building for distribution, the build process stays the same and its details are largely abstracted away.

The build process that sbt triggers involves the following steps,

All of the above steps happen automatically when you run an sbt build job that triggers the compile phase. In addition, during the package phase, the Scala and Java code and the built Rust binary are added to the built jar(s). To keep everything monolithic, the native module is not packaged as a separate jar; only the core module is.
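Since the built Rust binary is expected to end up inside the jar, one quick sanity check is to list the jar's entries and keep only those that look like shared libraries. The following is a minimal sketch, not part of the project: the helper name `native_entries`, the sample entry names, and the assumption that the binary carries a `.so`, `.dylib` or `.dll` extension are all illustrative.

```shell
#!/bin/sh
# Read a jar listing (one entry per line) on stdin and keep only the entries
# that look like native shared libraries (.so, .dylib, .dll).
# `|| true` keeps the exit status at 0 when nothing matches.
native_entries() {
  grep -E '\.(so|dylib|dll)$' || true
}

# Demo with a made-up listing; both entry names are hypothetical.
printf '%s\n' 'org/example/Foo.class' 'native/libexample.so' | native_entries
# prints "native/libexample.so"
```

Against a real build, the listing could come from the JDK's `jar tf <path-to-built-jar>` piped into `native_entries`; the jar path depends on your build output and is left as a placeholder here.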

The above process might look complicated, and it actually is 😂, but since all the internal sbt wiring is already in place, the user-facing process is fairly straightforward. First ensure that JDK 8+, sbt and the latest Rust compiler are installed, then run the commands below as needed.
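Before running any of the commands, it can help to confirm that the required tools are actually on your PATH. The sketch below is an illustration only: `missing_tools` is a hypothetical helper, and the executable names `java`, `sbt`, `cargo` and `rustc` are assumptions about a typical installation. It checks for presence only and does not verify that the JDK is version 8 or newer.

```shell
#!/bin/sh
# Print the name of every given tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# java for the JDK, sbt for the build, cargo and rustc for the Rust toolchain.
missing=$(missing_tools java sbt cargo rustc)
if [ -n "$missing" ]; then
  # Unquoted on purpose so the names are joined onto one line.
  echo "Missing tools:" $missing
else
  echo "All build tools found"
fi
```

If anything is reported missing, install it before running `sbt compile`, since the compile phase also builds the Rust code.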

Compilation

# To compile the whole project (Scala/Java/Rust) in one go
sbt compile

Local packaging/ installation

# To package the project and install locally as slim jars with the default Scala version.
sbt publishLocal

# To package the project and install locally as slim jars for all supported Scala versions.
sbt +publishLocal

Build Assembly (fat jar)

# To package the project as fat jars with the default Scala version.
sbt assembly

# To package the project as fat jars for all supported Scala versions.
sbt +assembly

Generate Native Binary Only

# To compile only the native module (the Rust code) to a binary.
sbt generateNativeLibrary

License

Apache License 2.0, see LICENSE.

Community

Reach out to the Polars community on Discord.