googleads / google-ads-python

Google Ads API Client Library for Python
Apache License 2.0

google.api_core.exceptions.Unknown: None Stream removed #707

g3or3 closed this issue 2 years ago

g3or3 commented 2 years ago

The workflow runs as part of an Airflow DAG task that pulls the latest Google Ads ad group data and stores it in S3 for further processing. The task executes hourly and usually succeeds in under 2 minutes. Lately, some task instances have retried after waiting 50+ minutes while the call to Google hangs, and the traceback looks as follows:

Request made: ClientCustomerId: XXXXXXXXX, Host: googleads.googleapis.com, Method: /google.ads.googleads.v10.services.GoogleAdsService/SearchStream, RequestId: None, IsFault: True, FaultMessage: Stream removed
Traceback (most recent call last):
  File "/code/venvs/venv/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 107, in next
    return six.next(self._wrapped)
  File "/code/venvs/venv/lib/python3.8/site-packages/google/ads/googleads/interceptors/response_wrappers.py", line 108, in __next__
    raise e
  File "/code/venvs/venv/lib/python3.8/site-packages/google/ads/googleads/interceptors/response_wrappers.py", line 105, in __next__
    self._failure_handler(self._underlay_call)
  File "/code/venvs/venv/lib/python3.8/site-packages/google/ads/googleads/interceptors/exception_interceptor.py", line 71, in _handle_grpc_failure
    raise self._get_error_from_response(response)
  File "/code/venvs/venv/lib/python3.8/site-packages/google/ads/googleads/interceptors/response_wrappers.py", line 88, in __next__
    message = next(self._underlay_call)
  File "/code/venvs/venv/lib/python3.8/site-packages/grpc/_channel.py", line 426, in __next__
    return self._next()
  File "/code/venvs/venv/lib/python3.8/site-packages/grpc/_channel.py", line 642, in _next
    return self._next_response()
  File "/code/venvs/venv/lib/python3.8/site-packages/grpc/_channel.py", line 617, in _next_response
    raise self
grpc._channel._SingleThreadedRendezvous: <_SingleThreadedRendezvous of RPC that terminated with:
    status = StatusCode.UNKNOWN
    debug_error_string = "{"created":"@1664334164.987534992","description":"Error received from peer ipv4","file":"src/core/lib/surface/call.cc","file_line":905,"grpc_message":"Stream removed","grpc_status":2}"
    details = "Stream removed"

Steps to Reproduce:

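from google.ads.googleads.client import GoogleAdsClient

# configs, account_id, bidding_strategy_type and adgroup_ids are defined elsewhere.
client = GoogleAdsClient.load_from_dict(configs)
service = client.get_service('GoogleAdsService')

request = client.get_type('SearchGoogleAdsStreamRequest')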
request.customer_id = account_id
request.query = f"""
            SELECT
                campaign.id,
                ad_group.id,
                campaign.bidding_strategy_type
            FROM ad_group
            WHERE ad_group.status = 'ENABLED'
                AND campaign.bidding_strategy_type = '{bidding_strategy_type}'
                AND ad_group.id IN {adgroup_ids}
        """

stream = service.search_stream(request)

for batch in stream:  # <- Exception occurs here
    for row in batch.results:
        ...  # process and write to s3

Exception:

Traceback (most recent call last):
  File "/etc/airflow/etl/current/lib/lyftdata/google_adwords/google_ads_service.py", line 866, in run_hourly_adgroup_bid_report
    for batch in adgroup_hourly_bids_results_stream:
  File "/code/venvs/venv/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 110, in next
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.Unknown: None Stream removed

Expected behavior:

Raise network error if occurred.

Client library version and API version: Client library version: 16.0.0, Google Ads API version: v10


BenRKarl commented 2 years ago

Hi @g3or3, in your example code, stream is a generator that only processes one chunk of the stream per iteration, so if logic in the body of the loop causes a delay, it could eventually cause the stream to stop. In other words, can you confirm that the steps marked "process and write to s3" aren't causing a bottleneck?
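
One way to rule out the loop body as the bottleneck is to drain the stream into memory before doing any slow work, so the stream is never left idle while the S3 writes run. A minimal sketch, assuming the full result set fits in memory; `drain_stream` is a hypothetical helper, not part of the client library, and it relies only on each batch exposing a `results` sequence as in the reproduction code:

```python
from typing import Any, Iterable, List


def drain_stream(stream: Iterable[Any]) -> List[Any]:
    """Consume every batch of a streamed response and return all rows.

    Hypothetical helper: it only assumes that each batch has a
    `results` attribute, as in the reproduction code.
    """
    rows: List[Any] = []
    for batch in stream:
        # No slow work happens here, so the stream is read as fast as
        # the server can send it.
        rows.extend(batch.results)
    return rows
```

With this in place, something like `rows = drain_stream(service.search_stream(request))` would finish reading the stream first, and the processing and S3 writes would then loop over `rows`. If the error still appears while draining, the loop body is unlikely to be the cause.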

If you're processing a large amount of data in each row, setting use_proto_plus to False in the client configuration can help improve performance.
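
As an illustration, `use_proto_plus` is a key in the configuration handed to `GoogleAdsClient.load_from_dict`. A small sketch, assuming `configs` is the same dict used in the reproduction code; `with_proto_plus_disabled` is a hypothetical helper name:

```python
def with_proto_plus_disabled(configs: dict) -> dict:
    """Return a copy of the client config with use_proto_plus turned off.

    The original dict is left untouched so other callers keep their settings.
    """
    updated = dict(configs)
    updated["use_proto_plus"] = False
    return updated


# Usage (requires the google-ads package and real credentials):
#   from google.ads.googleads.client import GoogleAdsClient
#   client = GoogleAdsClient.load_from_dict(with_proto_plus_disabled(configs))
```

With this setting, rows come back as plain protobuf messages rather than proto-plus wrappers, so code that relies on proto-plus conveniences may need adjusting.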

BenRKarl commented 2 years ago

@g3or3 Another thought: we recently had an outage that may be the cause here. It has since been fixed. Can you let me know if this problem is still occurring?

g3or3 commented 2 years ago

Thank you for the feedback. I will check to make sure we don't have any bottlenecks in the loop, although the issue seems to be resolved now as well. Thanks!

wihl commented 2 years ago

Details on the outage can be found at https://ads.google.com/status/publisher/incidents/TzP81sKq9EBjuSGURXSq