GoogleCloudDataproc / spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Apache License 2.0

Request to Stop Enforcing Delete Permissions on Materialization Dataset #1224

Open reynoldspravindev opened 1 month ago

reynoldspravindev commented 1 month ago

Hi, most organizations, like mine, do not grant service accounts delete access on datasets, and no temporary datasets are allotted to the service account for materialization. This causes failures when reading from views. I would like to understand why materialization is enforced on the user for basic operations like reading from a view. I understand that the data needs to be materialized by BigQuery for obvious reasons, but it makes no sense for the user to have to worry about this. Any help or direction on this is much appreciated. Thank you!

isha97 commented 1 month ago

Hi @reynoldspravindev,

Please check the reading-from-views documentation: https://github.com/GoogleCloudDataproc/spark-bigquery-connector?tab=readme-ov-file#reading-from-views. The BigQuery read session requires a table, so the connector must materialize the view into one. Regarding the materialization dataset: the connector may not have permission to create or delete datasets in the user's project, so it requires an existing dataset supplied by the user.
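For context, the README documents `viewsEnabled` and `materializationDataset` as the options that control this behavior. A minimal PySpark sketch of a view read using an existing dataset might look like the following; the project, dataset, and view names are placeholders, not values from this issue:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-view-read").getOrCreate()

df = (
    spark.read.format("bigquery")
    # Required to read views; the connector materializes the view
    # into a temporary table before opening a read session.
    .option("viewsEnabled", "true")
    # An existing dataset the service account can write to; the
    # connector does not create or delete datasets itself.
    .option("materializationDataset", "my_existing_dataset")  # placeholder
    # Temporary tables expire automatically after this interval
    # (documented option; default is 24 hours), so cleanup relies
    # on table expiration rather than explicit dataset deletes.
    .option("materializationExpirationTimeInMinutes", "60")
    .load("my_project.my_dataset.my_view")  # placeholder view
)
```

Note that the service account still needs permission to create tables in the materialization dataset, which is the requirement this issue asks to relax.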