airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Source Twilio: 500 Error when syncing Usage Records Stream #23512

Closed jitoquinto closed 4 months ago

jitoquinto commented 1 year ago
## Environment

- **Airbyte version**: 0.40.4
- **OS Version / Instance**: GCP VM
- **Deployment**: Docker
- **Source Connector and version**: Twilio 0.1.15
- **Destination Connector and version**: BigQuery 1.1.16
- **Step where error happened**: Sync job

## Current Behavior

When syncing the Usage Records stream, we keep getting the following API error response:

```
airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}
```

I tried hitting this API endpoint with curl, and the call succeeds when I change the PageSize to a smaller value (e.g. 100). I'm not sure whether this is an issue on the Twilio side, but reducing the number of records fetched per call seems to help.

## Expected Behavior

The Usage Records stream completes successfully.

## Logs
2023-02-27 16:00:30 destination > 2023-02-27 16:00:30 INFO i.a.i.d.b.BigQueryStagingConsumerFactory(lambda$onStartFunction$3):107 - Preparing tmp tables in destination completed.
2023-02-27 16:00:32 destination > 2023-02-27 16:00:32 INFO i.a.i.d.r.SerializedBufferingStrategy(lambda$addRecord$0):47 - Starting a new buffer for stream usage_records (current state: 0 bytes in 0 buffers)
2023-02-27 16:00:32 destination > 2023-02-27 16:00:32 INFO i.a.i.d.g.u.GcsUtils(getDefaultAvroSchema):25 - Default schema.
2023-02-27 16:00:59 source > Backing off _send(...) for 5.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:00:59 source > Caught retryable error 'Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}' after 1 tries. Waiting 5 seconds then retrying...
2023-02-27 16:01:30 source > Backing off _send(...) for 10.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:01:30 source > Caught retryable error 'Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}' after 2 tries. Waiting 10 seconds then retrying...
2023-02-27 16:02:06 source > Backing off _send(...) for 20.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:02:06 source > Caught retryable error 'Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}' after 3 tries. Waiting 20 seconds then retrying...
2023-02-27 16:02:52 source > Backing off _send(...) for 40.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:02:52 source > Caught retryable error 'Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}' after 4 tries. Waiting 40 seconds then retrying...
2023-02-27 16:03:58 source > Backing off _send(...) for 80.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:03:58 source > Caught retryable error 'Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}' after 5 tries. Waiting 80 seconds then retrying...
2023-02-27 16:05:44 source > Giving up _send(...) after 6 tries (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500})
2023-02-27 16:05:44 source > Encountered an exception while reading stream usage_records
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 120, in read
    yield from self._read_stream(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 189, in _read_stream
    for record in record_iterator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 305, in _read_full_refresh
    for record_data_or_message in record_data_or_messages:
  File "/airbyte/integration_code/source_twilio/streams.py", line 223, in read_records
    for record in super().read_records(sync_mode, cursor_field, stream_slice, stream_state):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 413, in read_records
    yield from self._read_pages(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 429, in _read_pages
    request, response = self._fetch_next_page(stream_slice, stream_state, next_page_token)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 452, in _fetch_next_page
    response = self._send_request(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 354, in _send_request
    return backoff_handler(user_backoff_handler)(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 314, in _send
    raise DefaultBackoffException(request=request, response=response, error_message=error_message)
airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}
2023-02-27 16:05:44 source > Finished syncing usage_records
2023-02-27 16:05:44 source > SourceTwilio runtimes:
Syncing stream usage_records 0:05:17.300131
2023-02-27 16:05:44 source > Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 120, in read
    yield from self._read_stream(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 189, in _read_stream
    for record in record_iterator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 305, in _read_full_refresh
    for record_data_or_message in record_data_or_messages:
  File "/airbyte/integration_code/source_twilio/streams.py", line 223, in read_records
    for record in super().read_records(sync_mode, cursor_field, stream_slice, stream_state):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 413, in read_records
    yield from self._read_pages(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 429, in _read_pages
    request, response = self._fetch_next_page(stream_slice, stream_state, next_page_token)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 452, in _fetch_next_page
    response = self._send_request(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 354, in _send_request
    return backoff_handler(user_backoff_handler)(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 314, in _send
    raise DefaultBackoffException(request=request, response=response, error_message=error_message)
airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/airbyte/integration_code/main.py", line 13, in <module>
    launch(source, sys.argv[1:])
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py", line 131, in launch
    for message in source_entrypoint.run(parsed_args):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/entrypoint.py", line 122, in run
    for message in generator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 133, in read
    raise AirbyteTracedException.from_exception(e, message=display_message) from e
airbyte_cdk.utils.traced_exception.AirbyteTracedException: Request URL: https://api.twilio.com/2010-04-01/Accounts/****/Usage/Records.json?PageSize=1000&StartDate=2023-01-02&EndDate=2023-02-27, Response Code: 500, Response Text: {"code": 20500, "message": "An internal server error has occurred", "more_info": "https://www.twilio.com/docs/errors/20500", "status": 500}
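
The waits reported in the log above (5, 10, 20, 40 and 80 seconds before giving up on the sixth try) double on each retry. A minimal sketch of that schedule, assuming a starting factor of 5 seconds and a base of 2; both values are inferred from the log lines, not taken from the CDK source, and `backoff_schedule` is a hypothetical helper name:

```python
from typing import List


def backoff_schedule(retries: int, factor: float = 5.0, base: float = 2.0) -> List[float]:
    """Return the wait (in seconds) before each retry: factor * base**attempt."""
    return [factor * base ** attempt for attempt in range(retries)]


# Six tries means five waits between them.
waits = backoff_schedule(5)
print(waits)       # [5.0, 10.0, 20.0, 40.0, 80.0]
print(sum(waits))  # 155.0 seconds spent waiting in total
```

Because every retry repeats the same `PageSize=1000` request, waiting longer does not change the outcome here; each attempt gets the same 500 response.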

## Steps to Reproduce

  1. Enable the Usage Records Stream with a high number of records to be synced.
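
To compare page sizes outside of Airbyte (for example with curl), the two request URLs can be built as in the sketch below. The URL shape and query parameters are copied from the error message above; `usage_records_url` is a hypothetical helper, and the account SID is a placeholder to replace with your own:

```python
from urllib.parse import urlencode

BASE_URL = "https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Usage/Records.json"


def usage_records_url(account_sid: str, page_size: int, start_date: str, end_date: str) -> str:
    """Build a Usage Records request URL with the given page size and date range."""
    query = urlencode({"PageSize": page_size, "StartDate": start_date, "EndDate": end_date})
    return BASE_URL.format(account_sid=account_sid) + "?" + query


# Placeholder account SID; the dates match the failing request in the logs.
for size in (1000, 100):
    print(usage_records_url("ACXXXXXXXX", size, "2023-01-02", "2023-02-27"))
```

In the report above, the request with `PageSize=1000` returned a 500 and the one with `PageSize=100` succeeded.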

## Are you willing to submit a PR?

Yes! I have made a change to the Page Size of the Usage Records stream on our fork and have been using that image to sync Twilio data. I will create a PR with that change.
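
The fork's change is not shown in this issue. As a rough illustration of the idea only, a per-stream page size override might look like the sketch below. The class names are hypothetical stand-ins and do not reproduce the connector's actual `streams.py`; the default of 1000 comes from the failing request URL and the value 100 from the curl test described above:

```python
from typing import Any, Dict, Mapping, Optional


class TwilioStreamSketch:
    """Hypothetical stand-in for a base stream with a shared default page size."""

    page_size = 1000  # value seen in the failing request URL

    def request_params(self, next_page_token: Optional[Mapping[str, Any]] = None) -> Dict[str, Any]:
        # Every request carries the stream's page size; pagination params are merged in.
        params: Dict[str, Any] = {"PageSize": self.page_size}
        if next_page_token:
            params.update(next_page_token)
        return params


class UsageRecordsSketch(TwilioStreamSketch):
    """Hypothetical Usage Records stream that asks for smaller pages."""

    page_size = 100  # smaller value that succeeded in the curl test


print(TwilioStreamSketch().request_params())   # {'PageSize': 1000}
print(UsageRecordsSketch().request_params())   # {'PageSize': 100}
```

Overriding the value on one stream leaves the page size of the other streams unchanged, at the cost of more requests for Usage Records.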

octavia-squidington-iii commented 5 months ago

At Airbyte, we seek to be clear about the project priorities and roadmap. This issue has not had any activity for 180 days, suggesting that it's not as critical as others. It's possible it has already been fixed. It is being marked as stale and will be closed in 20 days if there is no activity. To keep it open, please comment to let us know why it is important to you and if it is still reproducible on recent versions of Airbyte.

octavia-squidington-iii commented 4 months ago

This issue was closed because it has been inactive for 20 days since being marked as stale.