airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
15.45k stars 3.98k forks source link

[source-google-sheets] silently fails when reaching requests limit #36900

Closed henriquemeloo closed 2 weeks ago

henriquemeloo commented 5 months ago

Connector Name

source-google-sheets

Connector Version

0.4.0

What step the error happened?

During the sync

Relevant information

When the connector reaches rate limits and cannot continue the sync, it exits as success, so you have a successful job and incorrect data in the destination.

Relevant log output

2024-03-28 21:07:54 source > Fetching range digital_media!38857:39177
2024-03-28 21:07:54 source > Backing off get_values(...) for 0.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:54 source > Increasing number of records fetching due to rate limits. Current value: 330
2024-03-28 21:07:55 source > Fetching range digital_media!38857:39187
2024-03-28 21:07:55 source > Backing off get_values(...) for 1.7s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:55 source > Increasing number of records fetching due to rate limits. Current value: 340
2024-03-28 21:07:56 source > Fetching range digital_media!38857:39197
2024-03-28 21:07:56 source > Backing off get_values(...) for 0.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:56 source > Increasing number of records fetching due to rate limits. Current value: 350
2024-03-28 21:07:57 source > Fetching range digital_media!38857:39207
2024-03-28 21:07:57 source > Backing off get_values(...) for 2.1s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:57 source > Increasing number of records fetching due to rate limits. Current value: 360
2024-03-28 21:07:59 source > Fetching range digital_media!38857:39217
2024-03-28 21:07:59 source > Backing off get_values(...) for 5.9s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:07:59 source > Increasing number of records fetching due to rate limits. Current value: 370
2024-03-28 21:08:05 source > Fetching range digital_media!38857:39227
2024-03-28 21:08:05 source > Backing off get_values(...) for 5.4s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:05 source > Increasing number of records fetching due to rate limits. Current value: 380
2024-03-28 21:08:11 source > Fetching range digital_media!38857:39237
2024-03-28 21:08:11 source > Backing off get_values(...) for 2.0s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:11 source > Increasing number of records fetching due to rate limits. Current value: 390
2024-03-28 21:08:13 source > Fetching range digital_media!38857:39247
2024-03-28 21:08:13 source > Backing off get_values(...) for 100.8s (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:08:13 source > Increasing number of records fetching due to rate limits. Current value: 400
2024-03-28 21:09:54 source > Fetching range digital_media!38857:39257
2024-03-28 21:09:54 source > Giving up get_values(...) after 9 tries (googleapiclient.errors.HttpError: <httperror>)
2024-03-28 21:09:54 source > Stopped syncing process due to rate limits. Rate limit has been reached. Please try later or request a higher quota for your account.
2024-03-28 21:09:54 source > Finished syncing spreadsheet {spreadsheet_id}

Contribute

henriquemeloo commented 5 months ago

Here: https://github.com/airbytehq/airbyte/blob/5ea2b93c38ee8aaf5e4a161a82b4d6ee5513ac1c/airbyte-integrations/connectors/source-google-sheets/source_google_sheets/source.py#L227 we can see that this exception is logged and not re-raised, as it maybe should be, so that the sync would exit as failed.

natikgadzhi commented 5 months ago

We're rather actively looking into cases like this, and have a specific new error type for these — it definitely should not be just swallowed. Thank you for raising this!

natikgadzhi commented 5 months ago

/cc @maxi297 and @ChristoGrab, and @brianjlai — perhaps google sheets could be a good candidate for a cleanup once we release error handlers in low-code CDK and Python CDK sane defaults.

marcosmarxm commented 5 months ago

@natikgadzhi maybe https://github.com/airbytehq/airbyte/pull/35404 could help?

henriquemeloo commented 5 months ago

@natikgadzhi maybe #35404 could help?

It could help, but does not guarantee we won't still be swallowing these errors when they happen, which is, in my opinion, a critical problem, since this is leading to (very) incomplete syncs showing up as successful.

natikgadzhi commented 5 months ago

Yeah, if we're silently giving up but not raising the transient error to the platform, it's bad. We just introduced transient (rate limit) errors in the protocol, and @maxi297 and @katmarkham's team are looking in implementing them and cleaning up the highest usage connectors, which this is one of.

So — we will get to this rather quickly. @henriquemeloo, if we gave you an example of a fix in another connector, would you be interested in trying to put together a PR?

marcosmarxm commented 5 months ago

@henriquemeloo can you try latest version of Google Sheets connector. It has the batch size, maybe you can increase the value to not hit the 429 error

henriquemeloo commented 5 months ago

@natikgadzhi yes. I could try to implement it. I'd appreciate if you could send the example.

natikgadzhi commented 4 months ago

@strosek, @ChristoGrab, do you happen to have an example of error improvements work that you're doing now to highlight how to exit with non-zero exit code and send an error message for transient errors?

henriquemeloo commented 3 months ago

@natikgadzhi @strosek @ChristoGrab do you have an example of how this is being handled so I could try to contribute with this fix?