Add support for Spark (reusing Hive code)

GoogleCloudDataproc / dataproc-jdbc-connector

Apache License 2.0


Spark ships a fork of HiveServer2, the Spark Thrift server, to support JDBC: https://jaceklaskowski.gitbooks.io/mastering-spark-sql/content/spark-sql-thrift-server.html. Clients use Hive's JDBC client to interact with it.
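For illustration, here is a minimal sketch of talking to a Spark Thrift server through plain JDBC with a Hive-style URL. The class name, host, and database are placeholders, not anything from this repo. Port 10000 is the usual HiveServer2 default and is assumed here. Actually connecting requires the Hive JDBC driver (hive-jdbc) on the classpath and a running server.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class SparkThriftServerExample {
    // Builds a HiveServer2-style JDBC URL. Host, port, and database are placeholders.
    static String hiveUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs a trivial query. Needs the Hive JDBC driver on the classpath
    // and a reachable Spark Thrift server behind the given URL.
    static int selectOne(String url) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No URL given: only show what a URL would look like, do not connect.
            System.out.println("usage: SparkThriftServerExample <jdbc-url>");
            System.out.println("e.g.   " + hiveUrl("localhost", 10000, "default"));
            return;
        }
        System.out.println(selectOne(args[0]));
    }
}
```

Nothing in the client code is Spark-specific, so the same call works whether Hive or Spark is listening on the other end.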

Since Hive's JDBC client already works against Spark, the bulk of what we need is already done. As far as I know, the remaining TODOs are:

1) Create and document an init action that starts a Spark Thrift server. The init action must also configure Knox to expose that server.

2) Change the JDBC connector to accept jdbc:dataproc://spark URLs and translate them to the component gateway path for Spark.

3) Update the README to reflect this.
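A rough sketch of what TODO 2 could look like: pick the component name out of the jdbc:dataproc:// URL and look up its gateway path. Everything here is hypothetical, including the class name, the method names, and both gateway path strings. The real paths depend on how Knox is configured by the init action in TODO 1, and the connector's actual parsing code may be structured differently.

```java
import java.util.Map;

public class DataprocUrlSketch {
    static final String PREFIX = "jdbc:dataproc://";

    // Hypothetical gateway paths; the real values depend on the Knox setup.
    static final Map<String, String> GATEWAY_PATHS = Map.of(
            "hive", "gateway/default/hive",
            "spark", "gateway/default/spark");

    // Extracts the component name that follows the prefix, e.g. "spark".
    static String component(String url) {
        if (!url.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a Dataproc JDBC URL: " + url);
        }
        String rest = url.substring(PREFIX.length());
        int end = rest.length();
        // The component ends at the first '/', ';' or '?' (or at end of string).
        for (char delimiter : new char[] {'/', ';', '?'}) {
            int i = rest.indexOf(delimiter);
            if (i >= 0 && i < end) {
                end = i;
            }
        }
        return rest.substring(0, end);
    }

    // Maps the URL's component to its (assumed) component gateway path.
    static String gatewayPath(String url) {
        String name = component(url);
        String path = GATEWAY_PATHS.get(name);
        if (path == null) {
            throw new IllegalArgumentException("Unsupported component: " + name);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(gatewayPath("jdbc:dataproc://spark"));
    }
}
```

Keeping the component-to-path mapping in one table means Spark support is a one-line addition next to Hive, provided the rest of the URL handling is shared.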

Add support for Spark (reusing Hive code) #14