spotify / spark-bigquery

Google BigQuery support for Spark, SQL, and DataFrames
Apache License 2.0
155 stars 52 forks

Make a defensive copy to stop pollution of sparkconf #60

Closed edwardcapriolo closed 6 years ago

edwardcapriolo commented 6 years ago

The code in this project, as well as the code inside the underlying input format, makes direct changes to the Hadoop configuration. Unfortunately, this causes problems for processes that have to do work on "on-premise" HDFS clusters after doing work with the "off-premise" Google clusters. Essentially, without this copy, settings are written into the shared Spark configuration that later need to be unset, and it is not always easy to unset them.
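
To illustrate the idea, here is a minimal sketch of a defensive copy. It is not the code from this PR. A plain `Map<String, String>` stands in for the Hadoop configuration so the example runs without any dependencies, and the class name, method names, keys, and values are all hypothetical. With Hadoop itself, the analogous step would presumably be the `Configuration(Configuration other)` copy constructor.

```java
import java.util.HashMap;
import java.util.Map;

public class DefensiveCopySketch {

    // Hypothetical stand-in for connector code that writes its own
    // settings directly into whatever configuration it is handed.
    static void configureForBigQuery(Map<String, String> conf) {
        conf.put("fs.defaultFS", "gs://example-bucket");          // illustrative value
        conf.put("example.bigquery.project", "example-project");  // hypothetical key
    }

    // Defensive-copy variant: mutate a private copy and return it,
    // leaving the caller's shared configuration untouched.
    static Map<String, String> configuredCopy(Map<String, String> shared) {
        Map<String, String> copy = new HashMap<>(shared);
        configureForBigQuery(copy);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put("fs.defaultFS", "hdfs://onprem-namenode:8020");

        Map<String, String> forBigQuery = configuredCopy(shared);

        // The shared configuration still points at the on-premise cluster.
        System.out.println(shared.get("fs.defaultFS"));      // hdfs://onprem-namenode:8020
        // Only the copy carries the cloud-specific settings.
        System.out.println(forBigQuery.get("fs.defaultFS")); // gs://example-bucket
    }
}
```

Because the cloud-specific settings live only in the copy, later work against the on-premise cluster needs nothing unset.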