googleapis / google-auth-library-python

Google Auth Python Library
https://googleapis.dev/python/google-auth/latest/
Apache License 2.0

Impersonated credentials id_token creation to access a service behind IAP #810

Closed vanpelt closed 3 years ago

vanpelt commented 3 years ago

I've also opened a Google support request, but figured posting here would help other users searching for a resolution online.

Environment details

Steps to reproduce

I'm writing a Python library that needs to access a service behind the Identity-Aware Proxy (IAP). I'm hoping to simplify user setup by relying on application default credentials to create impersonated credentials from a service account that has access to the proxy. My code looks like:

import google.auth
from google.auth.transport import requests
from google.auth import impersonated_credentials

iap_client_id = "XXXX.apps.googleusercontent.com"
service_account_id = "XXXX@XXXX.iam.gserviceaccount.com"

creds = google.auth.default()
target = impersonated_credentials.Credentials(source_credentials=creds[0], target_principal=service_account_id, target_scopes=["https://www.googleapis.com/auth/cloud-platform"])
id_creds = impersonated_credentials.IDTokenCredentials(target, target_audience=iap_client_id, include_email=True)
authed_session = requests.AuthorizedSession(id_creds) 
res = authed_session.get("https://MY_WEBSERVICE", headers={"Accept": "application/json"})
# res.status_code => 401
# res.content => 'Invalid IAP credentials: empty token'

I Googled "Invalid IAP credentials: empty token" and haven't found anything helpful. I did find this issue from the Kubeflow project, but it doesn't get me any closer to resolving the problem.

It's not clear to me which target_scopes should be used when impersonating a service account to create an id_token. I've tried "openid" and "email" along with the cloud-platform scope in the example above. I've verified that the id_token is being issued and that it contains an "email" claim. When I parse the claims of the generated id_token, it looks like it has everything IAP would need to authenticate:

{
  "aud": "XXXX.apps.googleusercontent.com",
  "azp": "116974953377170881406",
  "email": "XXXX@XXXX.iam.gserviceaccount.com",
  "email_verified": true,
  "exp": 1626813908,
  "iat": 1626810308,
  "iss": "https://accounts.google.com",
  "sub": "XXXX"
}
busunkim96 commented 3 years ago

Tentatively triaging as type: question, please adjust as appropriate when you get a chance to review @arithmetic1728.

arithmetic1728 commented 3 years ago

@busunkim96 Thanks. I will take a look.

piyushnigam commented 3 years ago

Are you sure you are sending it to the IAP endpoint? Can you post the HTTP request headers?

vanpelt commented 3 years ago

Yes, I'm sure I'm sending the request to the IAP endpoint. It's clear the IAP endpoint is receiving the request, as it returns a 401 response code with a response body of: "Invalid IAP credentials: empty token".

I'm using the AuthorizedSession helper from this library, which sets the headers. I monkey-patched the library to print the headers for the request it makes, and they are exactly what one would expect:

{'Accept': 'application/json', 'authorization': 'Bearer XXXXXXXXXXXXXXXX.YYYYYYYYYYYYYYYYYY.ZZZZZZZZZZZZZZZ'}

When I decode the JWT in the Authorization header, its claims look like what I've already described in this ticket.
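For readers who want to inspect a token the same way, the following is a minimal sketch of decoding JWT claims with only the Python standard library. It does not verify the signature, so it is suitable for debugging only. The helper name decode_jwt_claims is hypothetical and is not part of google-auth.

```python
import base64
import json


def decode_jwt_claims(token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped,
    # so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Calling decode_jwt_claims on the token from the Authorization header should return a dict with keys such as "aud", "email", and "exp", like the claims shown earlier in this issue.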

vanpelt commented 3 years ago

I'm still stuck on this issue. I have been able to do further debugging and it's only gotten more mysterious. I'm able to generate ID token credentials using the gcloud command that work with curl but do not work with the Python requests library:

⇒  export ID_TOKEN=`gcloud auth print-identity-token --audiences=27586067691-u7dmkqvjr353fh4mo35pfnubotlfs78m.apps.googleusercontent.com --impersonate-service-account latest-app-wandb-dev-264@playground-111.iam.gserviceaccount.com --include-email`
WARNING: This command is using service account impersonation. All API calls will be executed as [latest-app-wandb-dev-264@playground-111.iam.gserviceaccount.com].
vanpelt@cvp:~/Development⚡ 
⇒  curl -H "Authorization: Bearer $ID_TOKEN" -H "Accept: application/json" https://latest.app.wandb.dev/healthz
ready!%                                                                                                                                              
vanpelt@cvp:~/Development⚡ 
⇒  ipython
Python 3.7.5 (default, Dec  2 2019, 13:23:22) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.8.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import requests 
     ...: import os                                                                                                                                    
In [2]: ID_TOKEN = os.environ["ID_TOKEN"]                                                                                                            
In [3]: r = requests.get("https://latest.app.wandb.dev/healhz", headers={"Authorization": "Bearer {}".format(ID_TOKEN), "Accept": "application/json"})
In [4]: r.content                                                                                                                                    
Out[4]: b'Invalid IAP credentials: empty token'
In [5]: r = !curl -s -H "Authorization: Bearer $ID_TOKEN" -H "Accept: application/json" https://latest.app.wandb.dev/healthz                         
In [6]: print(r)                                                                                                                                     
['ready!']

It seems like IAP is rejecting the Python requests based on headers or other metadata that differ from what curl is sending...

vanpelt commented 3 years ago

I've discovered the root cause! We write basic auth information to ~/.netrc by default. When no explicit auth is configured on the request or session, the requests library applies the matching ~/.netrc entry and overwrites the provided Authorization header with it 🙃 .
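To check whether a machine is affected, one option is to look for a netrc entry that matches the target host. The sketch below uses only the standard library; the helper name has_netrc_entry and its netrc_path parameter are hypothetical, and it assumes the file lives at ~/.netrc unless a path is given.

```python
import netrc
import os
from urllib.parse import urlparse


def has_netrc_entry(url, netrc_path=None):
    """Return True if the netrc file has credentials that apply to url's host."""
    path = netrc_path or os.path.expanduser("~/.netrc")
    if not os.path.exists(path):
        return False
    host = urlparse(url).hostname
    # authenticators() returns a (login, account, password) tuple for the
    # host, falls back to a "default" entry if one exists, else returns None.
    return netrc.netrc(path).authenticators(host) is not None
```

If this returns True for the IAP-protected URL, a plain requests call to that host will likely send the netrc credentials instead of the Bearer token.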

The solution is to either set trust_env = False on a requests.Session() object, or make google-auth-library-python use Proxy-Authorization instead of the regular Authorization header.
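A minimal sketch of the first option, assuming a plain requests.Session and a token obtained elsewhere (for example from the ID_TOKEN environment variable used above). The helper name bearer_session is hypothetical.

```python
import requests


def bearer_session(token):
    """Build a requests.Session that ignores ~/.netrc and sends a Bearer token."""
    session = requests.Session()
    # trust_env = False stops requests from consulting ~/.netrc (and the
    # proxy-related environment variables), so the header set below is
    # sent as-is instead of being replaced by basic auth.
    session.trust_env = False
    session.headers["Authorization"] = "Bearer {}".format(token)
    return session
```

Usage would look like bearer_session(ID_TOKEN).get(url, headers={"Accept": "application/json"}). Since google.auth.transport.requests.AuthorizedSession subclasses requests.Session, setting authed_session.trust_env = False on that object should have the same effect. Note that disabling trust_env also disables proxy settings taken from the environment, which may matter in some setups.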