A library for reading and writing data in Redis using Apache Spark.
Spark-Redis provides access to all of Redis' data structures - String, Hash, List, Set and Sorted Set - from Spark as RDDs. It also supports reading and writing with DataFrames and Spark SQL syntax.
The library can be used with both stand-alone and clustered Redis databases. When used with Redis Cluster, Spark-Redis is aware of its partitioning scheme and adjusts in response to resharding and node failure events.
Spark-Redis also supports Spark Streaming (DStreams) and Structured Streaming.
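As an illustration of the DataFrame support mentioned above, here is a minimal sketch of writing a DataFrame to Redis and reading it back. It assumes a Redis server reachable at `localhost:6379` and the Spark-Redis jar on the classpath. The `Person` case class, the `person` table name, and the sample rows are hypothetical. The data source name and option keys (`table`, `key.column`, `spark.redis.host`, `spark.redis.port`) are written as recalled from the library's documentation, so check them against the docs for the branch you use.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object SparkRedisSketch {
  // Hypothetical record type used only for this illustration.
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-redis-sketch")
      .master("local[*]")
      // Assumed Redis location; adjust for your environment.
      .config("spark.redis.host", "localhost")
      .config("spark.redis.port", "6379")
      .getOrCreate()
    import spark.implicits._

    // Build a small DataFrame from in-memory sample data.
    val people = Seq(Person("John", 30), Person("Peter", 45)).toDF()

    // Write each row to Redis under the "person" table,
    // using the "name" column as the key.
    people.write
      .format("org.apache.spark.sql.redis")
      .option("table", "person")
      .option("key.column", "name")
      .mode(SaveMode.Overwrite)
      .save()

    // Read the same table back into a DataFrame.
    val loaded = spark.read
      .format("org.apache.spark.sql.redis")
      .option("table", "person")
      .option("key.column", "name")
      .load()

    loaded.show()
    spark.stop()
  }
}
```

Running this requires a live Redis instance, so treat it as a starting point rather than a verified recipe.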
The library has several branches, each corresponding to a different supported Spark version. For example, 'branch-2.3' works with any Spark 2.3.x version. The master branch contains the most recent development for the next release.
Spark-Redis | Spark | Redis | Supported Scala Versions |
---|---|---|---|
master | 3.2.x | >=2.9.0 | 2.12 |
3.0 | 3.0.x | >=2.9.0 | 2.12 |
2.4, 2.5, 2.6 | 2.4.x | >=2.9.0 | 2.11, 2.12 |
2.3 | 2.3.x | >=2.9.0 | 2.11 |
1.4 | 1.4.x | | 2.10 |
This library is a work in progress, so the API may change before the official release.
Please make sure you use documentation from the correct branch (2.4, 2.3, etc).
You're encouraged to contribute to the Spark-Redis project.
There are two ways you can do so:
If you encounter an issue while using the library, please report it via the project's issue tracker.
Code contributions to the Spark-Redis project can be made using pull requests. To submit a pull request: