Description of change
The googleads SDK client does pass a timeout of 60 minutes to the requests it makes directly through zeep. Testing showed, however, that the ReportDownloader we use to stream responses opens a raw socket without passing a timeout.
This is suspiciously close to the likely cause of an intermittent zombie job: the server's side of the connection drops but the tap's side does not, which would leave the tap waiting on a socket that never times out. The requests library documents this behavior for its usage with urllib.
Since this call is not configurable, as the link below shows, this PR monkeypatches urllib so that the socket it opens gets a default timeout of 300 seconds.
Googleads usage of open: https://github.com/googleads/googleads-python-lib/blob/17.0.0/googleads/adwords.py#L1690
Manual QA steps
Ran against a real connection to identify the call being made to urllib
Ran again with the patch to confirm that it was applied
Ran without breakpoints to confirm that data is still emitted as expected
Risks
Because the patch applies to all calls to open() made in the process, it could theoretically cause side effects, but the chance of a disastrous impact is likely low.
Rollback steps