googleads / googleads-python-lib

The Python client library for Google's Ads APIs
Apache License 2.0

ProxyError: 407: Authentication required #540

Closed hein3r closed 6 months ago

hein3r commented 6 months ago

Hi guys,

I am sitting behind a company firewall and have to use our proxy for any requests.

In other projects we relied on the requests package. Everything works fine here:

import requests

# %5C is the URL-encoded backslash between domain and user
proxies = {
    'http': 'http://[DOMAIN]%5C[USER]:[PWD]@[HOST]:8080'
}
r = requests.get("http://www.google.com", proxies=proxies)

However, having specified the exact same URL for the proxy in the googleads.yaml I get the following error for any request:

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='ads.google.com', port=443): Max retries exceeded with url: /apis/ads/publisher/v202402/ActivityService?wsdl (Caused by ProxyError('Unable to connect to proxy', OSError('Tunnel connection failed: 407 authenticationrequired')))

For example:

ad_manager_client = ad_manager.AdManagerClient.LoadFromStorage("googleads.yaml")
activity_service = ad_manager_client.GetService('ActivityService', version='v202402')

Any suggestions?

Thanks in advance.

msaniscalchi commented 6 months ago

Hello,

I strongly suspect the core issue here is that you're defining the proxy outside of the library. As currently implemented, we use ProxyConfig to pass the proxy details along to Zeep. This can be configured in googleads.yaml, or passed along as an argument for AdManagerClient if you prefer to set it up manually.

Regards, Mark

hein3r commented 6 months ago

Hi Mark,

thanks for the quick reply.

I doubt that this is the case (maybe my hint about requests was misleading, because I do not use this in the current Google Ads project).

I do specify the proxy in the googleads.yaml like so:

proxy_config:
    'http': 'http://[DOMAIN]%5C[USER]:[PWD]@[HOST]:8080'

This config is then used:

from googleads import ad_manager
ad_manager_client = ad_manager.AdManagerClient.LoadFromStorage("googleads.yaml")
activity_service = ad_manager_client.GetService('ActivityService', version='v202402')

When I try to get the ActivityService for example, I get the 407.

msaniscalchi commented 6 months ago

Hello,

Thanks for clarifying, I had in fact misinterpreted that original post, sorry about that!

So, I'll be upfront about it: it would be quite difficult for me to troubleshoot your specific proxy setup. What I can say is that receiving an HTTP 407 response strongly suggests that the library passed your proxy configuration along to Zeep, which then made the request and did at least reach your proxy, since the proxy is what returned the authentication error. I'd like to at least explain why this doesn't appear to be an issue with this library, so read on if that interests you.

Looking at the underlying implementation, LoadFromStorage eventually extracts a ProxyConfig based on your proxy_config found in googleads.yaml. When you call GetService, that ProxyConfig is then passed to GetServiceClassForLibrary, which uses ZeepServiceProxy to initialize a transport with the ProxyConfig, which is used by Zeep's Client to handle communication. I can confirm that the proxy configuration is passed to Zeep and should be applied to its requests, so as far as this library is concerned it appears to be working as intended.

For some additional context, the transport created above is a _ZeepProxyTransport, which applies the proxies to the Session used by the Zeep Client to make requests. As an example, for the GetService call, Zeep eventually calls the transport's _load_remote_data, where the following is used to retrieve the WSDL:

        response = self.session.get(url, timeout=self.load_timeout)

Once again, I can confirm that the proxy configuration is applied to the session used to make the call, so it appears to be working as intended.

hein3r commented 6 months ago

Hi Mark,

thank you for this extensive explanation. With that I was able to resolve the issue.

Once I figured out that requests does in fact handle Zeep's transport, I worked up a minimal example for a proxied session. As it turned out, requests was ignoring the explicitly configured proxy in favor of an environment variable that was still set and that I didn't know about. After removing that env var (either manually or by bypassing it with session.trust_env = False), the proxy config was recognized correctly.

Thanks again and best regards
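For anyone landing here with the same symptom, this is a minimal sketch of the effect, not the exact script used above. It assumes a requests version where Session.merge_environment_settings lets proxies from environment variables take precedence over session.proxies while trust_env is True. Both proxy URLs and the helper names (effective_proxies, demo) are hypothetical placeholders; nothing is sent over the network.

```python
import os

import requests

URL = "https://ads.google.com/"
# Placeholder proxy URLs for illustration only; neither is a real host.
EXPLICIT_PROXY = "http://explicit-proxy.invalid:8080"
STALE_ENV_PROXY = "http://stale-proxy.invalid:3128"
PROXY_ENV_NAMES = {"http_proxy", "https_proxy", "all_proxy", "no_proxy"}


def effective_proxies(session, url=URL):
    """Return the proxies requests would pick for `url`, without sending a request."""
    settings = session.merge_environment_settings(url, {}, None, None, None)
    return dict(settings["proxies"])


def demo():
    """Compare proxy resolution with and without trust_env."""
    saved = os.environ.copy()
    try:
        # Start from a clean slate, then simulate a forgotten environment variable.
        for name in list(os.environ):
            if name.lower() in PROXY_ENV_NAMES:
                del os.environ[name]
        os.environ["HTTPS_PROXY"] = STALE_ENV_PROXY

        session = requests.Session()
        session.proxies = {"https": EXPLICIT_PROXY}

        # Default behaviour: the environment is consulted.
        with_env = effective_proxies(session)

        # Ignore the environment entirely for this session.
        session.trust_env = False
        without_env = effective_proxies(session)
        return with_env, without_env
    finally:
        # Restore the original environment.
        os.environ.clear()
        os.environ.update(saved)


if __name__ == "__main__":
    with_env, without_env = demo()
    print("trust_env=True :", with_env)
    print("trust_env=False:", without_env)
```

Printing both results side by side makes it easy to see which proxy a session would actually use. Note that trust_env = False also disables other environment lookups in requests, such as .netrc credentials and the CA bundle variables, so unsetting the stray variable may be the less invasive fix.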

msaniscalchi commented 6 months ago

Great, happy to hear you were able to figure it out!