airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
15.53k stars 4k forks source link

[destination-google-sheets] Append + Deduped causes request limit exceeding Details: "Cannot exceed 100000 requests in a single batch. #42904

Open dmytrozeleny opened 1 month ago

dmytrozeleny commented 1 month ago

Connector Name

Google Sheets

Connector Version

0.2.6

What step the error happened?

During the sync

Relevant information

When updating two sheets daily with ~30 rows and ~80 rows respectively using Append + Deduped incremental strategy I started receiving the following error:

<HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/<sheet_id>:batchUpdate?fields=%2A&alt=json returned "Cannot exceed 100000 requests in a single batch.". Details: "Cannot exceed 100000 requests in a single batch.">

Replication took 6 attempts and all 6 logs show the same error. Switching to the Append (without Deduped) strategy helped. The issue may happen on the deduplication stage.

Relevant log output

2024-07-29 05:26:11 destination > <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/<sheet_id>:batchUpdate?fields=%2A&alt=json returned "Cannot exceed 100000 requests in a single batch.". Details: "Cannot exceed 100000 requests in a single batch.">
Traceback (most recent call last):
  File "/airbyte/integration_code/main.py", line 11, in <module>
    DestinationGoogleSheets().run(sys.argv[1:])
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/destinations/destination.py", line 119, in run
    for message in output_messages:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/destinations/destination.py", line 113, in run_cmd
    yield from self._run_write(config=config, configured_catalog_path=parsed_args.catalog, input_stream=wrapped_stdin)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/destinations/destination.py", line 49, in _run_write
    yield from self.write(config=config, configured_catalog=catalog, input_messages=input_messages)
  File "/airbyte/integration_code/destination_google_sheets/destination.py", line 83, in write
    writer.deduplicate_records(configured_stream)
  File "/airbyte/integration_code/destination_google_sheets/writer.py", line 90, in deduplicate_records
    self.spreadsheet.remove_duplicates(stream, rows_to_remove)
  File "/airbyte/integration_code/destination_google_sheets/spreadsheet.py", line 98, in remove_duplicates
    stream.link()
  File "/usr/local/lib/python3.9/site-packages/pygsheets/worksheet.py", line 198, in link
    self.update_cells(tmp_data_grid)
  File "/usr/local/lib/python3.9/site-packages/pygsheets/utils.py", line 163, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pygsheets/worksheet.py", line 758, in update_cells
    self.client.sheet.batch_update(self.spreadsheet.id, requests)
  File "/usr/local/lib/python3.9/site-packages/pygsheets/sheet.py", line 119, in batch_update
    return self._execute_requests(request)
  File "/usr/local/lib/python3.9/site-packages/pygsheets/sheet.py", line 495, in _execute_requests
    response = request.execute(num_retries=self.retries)
  File "/usr/local/lib/python3.9/site-packages/googleapiclient/_helpers.py", line 131, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/googleapiclient/http.py", line 937, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://sheets.googleapis.com/v4/spreadsheets/<sheet_id>:batchUpdate?fields=%2A&alt=json returned "Cannot exceed 100000 requests in a single batch.". Details: "Cannot exceed 100000 requests in a single batch.">

Contribute

jessica-bauer commented 3 weeks ago

I'm also experiencing this issue with the Full Refresh + Overwrite sync mode with MySQL as my source and Google Sheets as my destination.