Open martinstuder opened 6 years ago
The only way around the first issue that I came across so far is to shade the Google classes using sbt-assembly.
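For illustration, a minimal sbt-assembly shade rule of the kind I mean looks roughly like this (the `repackaged` prefix is arbitrary; any unused package name works):

```scala
// build.sbt -- sketch of shading Google's Guava classes with sbt-assembly.
// ShadeRule.rename rewrites the bytecode references so the assembled jar's
// Guava no longer collides with the version Spark ships.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "repackaged.com.google.common.@1").inAll
)
```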
Regarding the second issue, I found that setting `spark.hadoop.google.cloud.auth.service.account.json.keyfile` helps when using `sqlContext.bigQueryTable`. When using `sqlContext.bigQuerySelect` I seem to need to set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
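Concretely, the `spark.hadoop.` prefix just forwards the option into the Hadoop configuration, so the same setting can also be applied in code; a sketch (the key-file path is a placeholder):

```scala
// Equivalent to setting spark.hadoop.google.cloud.auth.service.account.json.keyfile:
// options under the spark.hadoop. prefix are copied into the Hadoop conf.
sc.hadoopConfiguration.set(
  "google.cloud.auth.service.account.json.keyfile",
  "/path/to/key.json") // placeholder path

// For sqlContext.bigQuerySelect, the environment variable still seems required,
// exported before launching Spark:
//   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```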
I'm trying to use spark-bigquery 0.2.1 with a locally running Spark 2.2.0.
When trying to run through the spark-shell via
spark-shell --packages com.spotify:spark-bigquery_2.11:0.2.1
I run into the following exception due to incompatible versions of guava:

What is the recommended way of working around this?
In another scenario I built a Spark package based on spark-bigquery 0.2.1 that attempts to connect to Google BigQuery (again from a locally running Spark 2.2.0 cluster). In that case, I always run into the following issue:
I tried several of the following combinations, but to no avail:

- using `setGcpJsonKeyFile`, `setBigQueryProjectId`, `setBigQueryGcsBucket`, `setBigQueryDatasetLocation`
- setting `GOOGLE_APPLICATION_CREDENTIALS` to a valid JSON key file (before starting the local Spark cluster, in `spark-env.sh` and in `spark-defaults.conf`; setting `GOOGLE_APPLICATION_CREDENTIALS` seems to work with https://github.com/samelamin/spark-bigquery though)
- setting `mapred.bq.*`, `fs.gs.*` and `google.cloud.*` options
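For reference, the setter combination above corresponds to spark-bigquery 0.2.1's implicit methods on `SQLContext`; a minimal sketch of what I tried (paths, project id, bucket, table and query are placeholders):

```scala
// Sketch of the setter-based configuration attempted above, using the
// implicits that spark-bigquery adds to SQLContext. All values are placeholders.
import com.spotify.spark.bigquery._

sqlContext.setGcpJsonKeyFile("/path/to/key.json")
sqlContext.setBigQueryProjectId("my-billing-project")
sqlContext.setBigQueryGcsBucket("my-staging-bucket")
sqlContext.setBigQueryDatasetLocation("US")

// Both read paths were tried; bigQuerySelect is the one that also
// appeared to require GOOGLE_APPLICATION_CREDENTIALS.
val table = sqlContext.bigQueryTable("my-project:my_dataset.my_table")
val query = sqlContext.bigQuerySelect(
  "SELECT word, word_count FROM [bigquery-public-data:samples.shakespeare]")
```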