GoogleCloudDataproc / spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Apache License 2.0

How to enforce BQ connector to NOT use Storage read API in occasion of reaching limits? #1301

Closed · anish97IND closed this issue 1 month ago

anish97IND commented 1 month ago

Hi Team, we have a job that uses the BQ connector, and while reading data it is hitting a Storage Read API limitation which says that each row or filter should not exceed 1 MB, per the doc (https://cloud.google.com/bigquery/quotas#storage-limits).

Error: `INVALID_ARGUMENT: read_session.read_options.row_restriction exceeded maximum allowed length. Maximum bytes allowed: 1048576`

  1. Is there any way we can direct the BQ connector to use the legacy API?
  2. Is there any fix we could introduce to avoid hitting this limit?
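
The error message above indicates that it is the filter Spark pushes down to the connector, sent to the Storage Read API as `read_session.read_options.row_restriction`, that exceeds the 1 MB cap, not the row data itself. One possible workaround, not confirmed in this thread, is to avoid generating a huge pushed-down predicate and instead filter inside Spark, for example by joining against the key set. A minimal sketch, assuming a hypothetical table `my_project.my_dataset.events`, a key column `id`, and a key list stored at `gs://my-bucket/wanted_ids.txt`:

```scala
import org.apache.spark.sql.SparkSession

object AvoidHugeRowRestriction {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("bq-read-avoid-huge-row-restriction")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical: a large set of ids that, rendered as an IN (...) predicate,
    // would exceed the 1 MB row_restriction limit if pushed to the Storage Read API.
    val wantedIds = spark.read.textFile("gs://my-bucket/wanted_ids.txt").toDF("id")

    // Read the BigQuery table without a connector-side filter, so no long
    // row_restriction string is attached to the read session.
    val events = spark.read
      .format("bigquery")
      .option("table", "my_project.my_dataset.events")
      .load()

    // Apply the selection inside Spark via a join; joins are evaluated by Spark,
    // not pushed down to the Storage Read API.
    val filtered = events.join(wantedIds, Seq("id"))

    filtered.show()
    spark.stop()
  }
}
```

Reading the full table and joining scans more data than a pushed-down filter would, so this trades Storage Read API bytes for a smaller read-session request; whether that trade is acceptable depends on the table size.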
isha97 commented 1 month ago

Hi @anish97IND, please open a ticket with Google support to increase your quotas.

anish97IND commented 1 month ago

> When you use the Storage Read API CreateReadSession call, you are limited to a maximum length of 1 MB for each row or filter.

Hi @isha97, thanks for responding. I was going over the doc, and this seems to be a limit rather than a quota; can it be increased too? There is no mention of that in the official doc (https://cloud.google.com/bigquery/quotas#storage-limits); a snapshot of the relevant section is quoted above.
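
As a further hedged sketch, not something confirmed in this thread: if the connector version in use exposes the `pushAllFilters` read option described in the connector README, disabling wholesale filter pushdown should let Spark evaluate a long predicate itself instead of shipping it as the read session's `row_restriction`. The option name and its exact semantics should be verified against the README for the version in use:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("bq-read-no-pushdown").getOrCreate()

// Stand-in for a predicate value list far too long to push down safely.
val veryLongIdList: Seq[String] = Seq("id-1", "id-2", "id-3")

// Assumption: "pushAllFilters" exists in the connector version in use and, when set
// to false, keeps Spark-side predicates out of read_session.read_options.row_restriction.
val df = spark.read
  .format("bigquery")
  .option("table", "my_project.my_dataset.events")   // hypothetical table
  .option("pushAllFilters", "false")
  .load()
  .where(col("id").isin(veryLongIdList: _*))         // evaluated by Spark executors

df.show()
```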