Neo4j Connector for Apache Spark

This repository contains the Neo4j Connector for Apache Spark, which provides bi-directional read/write access to Neo4j from Spark using the Spark DataSource API.

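As a quick sketch of what the DataSource API looks like in practice, the snippet below reads :Person nodes into a DataFrame and writes them back. The connection URL, credentials, label, and key property are placeholders, and the option names follow the connector documentation; check the docs for the exact options supported by the version you use.

import org.apache.spark.sql.{SaveMode, SparkSession}

object Neo4jConnectorExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("neo4j-connector-example")
      .getOrCreate()

    // Read every node carrying the :Person label into a DataFrame.
    // The URL and credentials below are placeholders.
    val people = spark.read
      .format("org.neo4j.spark.DataSource")
      .option("url", "bolt://localhost:7687")
      .option("authentication.basic.username", "neo4j")
      .option("authentication.basic.password", "password")
      .option("labels", "Person")
      .load()

    people.show()

    // Write the DataFrame back as :Person nodes, using the "name" column
    // as the key property when overwriting.
    people.write
      .format("org.neo4j.spark.DataSource")
      .mode(SaveMode.Overwrite)
      .option("url", "bolt://localhost:7687")
      .option("authentication.basic.username", "neo4j")
      .option("authentication.basic.password", "password")
      .option("labels", ":Person")
      .option("node.keys", "name")
      .save()

    spark.stop()
  }
}
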
License

The Neo4j Connector for Apache Spark is licensed under the Apache License, version 2.0.

Documentation

The documentation for the Neo4j Connector for Apache Spark lives in the https://github.com/neo4j/docs-spark repository.

Building for Spark 3

You can build for Spark 3.x with both Scala 2.12 and Scala 2.13

./maven-release.sh package 2.12
./maven-release.sh package 2.13

Each command builds the connector JAR for the corresponding Scala version.

Integration with Apache Spark Applications

spark-shell, pyspark, or spark-submit

You can include the connector either as a local JAR with --jars or as a Maven coordinate with --packages:

$SPARK_HOME/bin/spark-shell --jars neo4j-connector-apache-spark_2.12-<version>_for_spark_3.jar

$SPARK_HOME/bin/spark-shell --packages org.neo4j:neo4j-connector-apache-spark_2.12:<version>_for_spark_3
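
Once the shell starts with the connector on the classpath, it can be exercised directly from the Scala REPL. The URL, credentials, and Cypher query below are placeholders; the option names follow the connector documentation.

// Paste into the spark-shell session started above.
val df = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://localhost:7687")
  .option("authentication.basic.username", "neo4j")
  .option("authentication.basic.password", "password")
  .option("query", "MATCH (p:Person) RETURN p.name AS name")
  .load()

df.show()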

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"
libraryDependencies += "org.neo4j" % "neo4j-connector-apache-spark_2.12" % "<version>_for_spark_3"
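
For context, a minimal build.sbt pulling in both Spark and the connector might look like the sketch below; the Scala and Spark versions are assumptions to be replaced with the ones you actually target.

// Minimal build.sbt sketch; version numbers are placeholders.
scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.5.0" % "provided",
  "org.neo4j" % "neo4j-connector-apache-spark_2.12" % "<version>_for_spark_3"
)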

maven

In your pom.xml, add:

<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j-connector-apache-spark_2.12</artifactId>
    <version>[version]_for_spark_3</version>
  </dependency>
</dependencies>

For more information about the available versions, visit https://neo4j.com/developer/spark/overview/#_compatibility