airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

[Low code] Can not config Error Handler for Backoff Strategy #42928

Open Danni2901 opened 3 months ago

Danni2901 commented 3 months ago

Platform Version

0.63.9

What step the error happened?

During the Sync

Relevant information

Configuring the stream Backoff Strategy in a custom source does not work on 0.63.9: it always uses an exponential backoff, although it worked in version 0.61.0.

Relevant log output

2024-08-01 08:14:42 platform > Stream status TRACE received of status: RUNNING for stream OrderItems
2024-08-01 08:14:42 source > Backing off _send(...) for 2.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
2024-08-01 08:14:42 source > Caught retryable error 'Too many requests.' after 2 tries. Waiting 2 seconds then retrying...
2024-08-01 08:14:44 platform > Stream status TRACE received of status: RUNNING for stream OrderItems
2024-08-01 08:14:44 source > Backing off _send(...) for 4.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
2024-08-01 08:14:44 source > Caught retryable error 'Too many requests.' after 3 tries. Waiting 4 seconds then retrying...
2024-08-01 08:14:48 source > Backing off _send(...) for 1.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
2024-08-01 08:14:48 source > Caught retryable error 'Too many requests.' after 1 tries. Waiting 1 seconds then retrying...
2024-08-01 08:14:48 platform > Stream status TRACE received of status: RUNNING for stream OrderItems
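The wait times in the log above can be pulled out with a short script to make the pattern easier to see. This is only a sketch that assumes the log format shown here; the function name `extract_backoff_seconds` and the regular expression are illustrative and are not part of the Airbyte CDK.

```python
import re

# The three "Backing off" lines copied from the log output above.
LOG = """\
2024-08-01 08:14:42 source > Backing off _send(...) for 2.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
2024-08-01 08:14:44 source > Backing off _send(...) for 4.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
2024-08-01 08:14:48 source > Backing off _send(...) for 1.0s (airbyte_cdk.sources.streams.http.exceptions.RateLimitBackoffException: Too many requests.)
"""

# Hypothetical pattern: captures the number of seconds in "for <N>s".
BACKOFF_RE = re.compile(r"Backing off _send\(\.\.\.\) for ([0-9.]+)s")


def extract_backoff_seconds(log_text):
    """Return the backoff durations (in seconds) found in the log text."""
    return [float(match.group(1)) for match in BACKOFF_RE.finditer(log_text)]


waits = extract_backoff_seconds(LOG)
print(waits)
```

None of the extracted waits is 60 seconds, which is what a constant 60-second strategy would be expected to produce.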

marcosmarxm commented 3 months ago

Can you share more information, such as how you are configuring the backoff and the stream?

Danni2901 commented 3 months ago

I have retrieved the configuration from the UI.

type: CompositeErrorHandler
error_handlers:
  - type: DefaultErrorHandler
    backoff_strategies:
      - type: ConstantBackoffStrategy
        backoff_time_in_seconds: 60
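To make the mismatch concrete, here is a minimal sketch comparing the wait a constant strategy with `backoff_time_in_seconds: 60` would be expected to give against the doubling pattern seen in the log (1 second after 1 try, 2 seconds after 2 tries, 4 seconds after 3 tries). Both functions are hypothetical illustrations written for this comparison, not the CDK's implementation.

```python
def constant_backoff(attempt, backoff_time_in_seconds=60):
    """Expected wait for a constant strategy: the same value on every attempt."""
    return float(backoff_time_in_seconds)


def exponential_backoff(attempt):
    """Doubling pattern that matches the log: 1s, 2s, 4s for attempts 1, 2, 3.

    This formula is an assumption inferred from the log lines, not taken
    from the CDK source.
    """
    return float(2 ** (attempt - 1))


for attempt in (1, 2, 3):
    print(
        f"attempt {attempt}: "
        f"configured={constant_backoff(attempt)}s "
        f"observed={exponential_backoff(attempt)}s"
    )
```

If the configured strategy were applied, every retry would wait 60 seconds regardless of the attempt number.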

mmostr123 commented 3 months ago

Regarding our company's usage: We've tested all options including backoff, constant, and reading values from headers, but none of these seem to override the default backoff behavior. Platform version: 0.63.0

gavin-ob commented 2 weeks ago

I am having the same issue, even on version 1.1.0. It started with a previous upgrade, but I am not sure which one, as it only became evident to me when a stream eventually failed and I looked into it. I am fairly sure it was around the version mentioned above.

Whatever I set for the backoff strategy, it just waits 1, 2, 4, 8 seconds. The connectors in question had been using wait-time-from-header, successfully reading the value in seconds from the retry-after response header; now none of them use it.
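For reference, the behaviour expected from a wait-time-from-header strategy can be sketched roughly as follows. This is an illustration under assumptions, not the CDK's code: the function name `wait_time_from_header` is hypothetical, and it only handles a header value given in seconds, as described above.

```python
def wait_time_from_header(headers, header_name="Retry-After"):
    """Return the wait in seconds taken from a response header, or None.

    Header names are compared case-insensitively. A missing header, a
    non-numeric value, or a negative value yields None, in which case a
    caller would fall back to some other backoff strategy.
    """
    lowered = {key.lower(): value for key, value in headers.items()}
    raw = lowered.get(header_name.lower())
    if raw is None:
        return None
    try:
        seconds = float(raw)
    except ValueError:
        return None
    return seconds if seconds >= 0 else None


# Example: a 429 response carrying "retry-after: 30" should lead to a 30s wait.
print(wait_time_from_header({"retry-after": "30"}))
print(wait_time_from_header({"Content-Type": "application/json"}))
```

With a header value of 30, the retry would be expected to wait 30 seconds rather than follow the 1, 2, 4, 8 second sequence.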

Nothing has changed in the API response, so I am not sure how this can be. What information can I provide to help?

@mmostr123 did you manage to resolve this?

Here is how it is configured:

Image

Here is an example response:

Image

and here is what I see in the pod logs:

Image

miguelduarte18 commented 1 week ago

Facing the same issue. I wanted to implement a backoff strategy to override the default handling of 429s, but the default behaviour persists.