googleapis / google-auth-library-python

Google Auth Python Library
https://googleapis.dev/python/google-auth/latest/
Apache License 2.0
Unable to retrieve AWS role name #1364

Open ahlag opened 1 year ago

ahlag commented 1 year ago


Steps to reproduce

  1. Run the following code after setting up Workload Identity Federation by following the GCP doc and the YouTube video. I replicated all the steps.

    import google.auth
    import os
    from google.cloud import storage

    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "./XX.json"
    os.environ['GOOGLE_CLOUD_PROJECT'] = "XXXX"

    client = storage.Client()
    buckets = client.list_buckets()
    for bucket in buckets:
        print(bucket.name)



ahlag commented 1 year ago

Error Log

Traceback (most recent call last):
  File "/home/ec2-user/sample.py", line 8, in <module>
    client = storage.Client()
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/cloud/storage/client.py", line 173, in __init__
    super(Client, self).__init__(
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/cloud/client/__init__.py", line 321, in __init__
    Client.__init__(
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/cloud/client/__init__.py", line 178, in __init__
    credentials, _ = google.auth.default(scopes=scopes)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/_default.py", line 675, in default
    project_id = credentials.get_project_id(request=request)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/external_account.py", line 342, in get_project_id
    self.before_request(request, "GET", url, headers)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/credentials.py", line 156, in before_request
    self.refresh(request)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/external_account.py", line 363, in refresh
    self._impersonated_credentials.refresh(request)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/impersonated_credentials.py", line 247, in refresh
    self._update_token(request)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/impersonated_credentials.py", line 260, in _update_token
    self._source_credentials.refresh(request)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/external_account.py", line 381, in refresh
    subject_token=self.retrieve_subject_token(request),
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/aws.py", line 482, in retrieve_subject_token
    aws_security_credentials = self._get_security_credentials(
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/aws.py", line 618, in _get_security_credentials
    role_name = self._get_metadata_role_name(request, imdsv2_session_token)
  File "/home/ec2-user/.local/lib/python3.9/site-packages/google/auth/aws.py", line 720, in _get_metadata_role_name
    raise exceptions.RefreshError(
google.auth.exceptions.RefreshError: ('Unable to retrieve AWS role name', '<?xml version="1.0" encoding="iso-8859-1"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n\t"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n <head>\n  <title>401 - Unauthorized</title>\n </head>\n <body>\n  <h1>401 - Unauthorized</h1>\n </body>\n</html>\n')
sassmith commented 1 year ago

Hey @ahlag! I was following the same guides and ran into the same error. I'll share how I fixed it; hopefully it helps you too.

The google.auth library is trying to use the AWS Instance Metadata Service (IMDS) of your EC2 instance to grab the aws_role_name, aws_region, etc.

There are two versions of IMDS: v1 is a request/response method and v2 is a session-oriented method. google.auth supports both. To get the AWS region, it makes a GET request to http://169.254.169.254/latest/meta-data/placement/availability-zone (the AWS metadata server) and adds the IMDSv2 session token to the request headers if one is present.

For me, the IMDSv2 session token was not being included in the request, making it an IMDSv1 request. However, my EC2 instance required IMDSv2, which produced the same 401 Unauthorized message you got. Changing my EC2 instance to IMDSv2 optional fixed the issue.
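To make the difference between the two request styles concrete, here is a minimal sketch using only the standard library. The helper names are hypothetical and this is not how google.auth is implemented; the two header names are the ones AWS documents for IMDSv2. The first two functions only build requests, so they run anywhere. `fetch` only works on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"  # link-local address of the EC2 metadata server


def build_token_request(ttl_seconds=300):
    """IMDSv2 step 1: a PUT to /latest/api/token asks for a session token."""
    return urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path, session_token=None):
    """Build a GET for a metadata path; without a token it is an IMDSv1 request."""
    request = urllib.request.Request(IMDS_BASE + path, method="GET")
    if session_token is not None:
        # IMDSv2 step 2: present the session token on every metadata call.
        request.add_header("X-aws-ec2-metadata-token", session_token)
    return request


def fetch(request, timeout=2):
    """Send a request (hypothetical helper); only reachable from an EC2 instance."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

On an instance that requires IMDSv2, sending `build_metadata_request(path)` without a token is the kind of call that comes back as 401 Unauthorized.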

Steps to fix:

[Screenshots: 2023-08-22 at 4:43:37 PM and 2023-08-22 at 3:31:54 PM]
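For anyone who prefers a script to the console, the same setting can be changed with boto3. This is a rough sketch under assumptions: boto3 is installed, the caller has permission to modify the instance, and the function names and instance ID are placeholders. It uses the EC2 `modify_instance_metadata_options` call, where `HttpTokens="optional"` makes IMDSv2 optional.

```python
def metadata_options(instance_id, require_imdsv2=False):
    """Build the arguments for EC2 ModifyInstanceMetadataOptions."""
    return {
        "InstanceId": instance_id,
        # "optional" accepts both IMDSv1 and IMDSv2; "required" enforces IMDSv2.
        "HttpTokens": "required" if require_imdsv2 else "optional",
        "HttpEndpoint": "enabled",
    }


def set_imdsv2_optional(instance_id, region_name=None):
    """Apply the change; needs boto3 and AWS credentials (hypothetical helper)."""
    import boto3  # imported lazily so metadata_options() works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.modify_instance_metadata_options(**metadata_options(instance_id))
```

Calling `set_imdsv2_optional("i-xxxx")` with a real instance ID would apply the change to that instance.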

If your organization requires IMDSv2, I unfortunately don't know the steps needed to get around that, but hopefully this gets you a little closer!

Long story short: I think this is a configuration issue on the AWS side, which most of these Workload Identity Federation guides gloss over, and not actually a bug in the google.auth library.

ahlag commented 1 year ago

Hey @sassmith, I got it working with the same steps too! Thank you!

zchenyu commented 1 year ago

Ran into this as well, thanks @sassmith for the workaround.

I also found a way to make it work with IMDSv2. Just add "imdsv2_session_token_url": "http://169.254.169.254/latest/api/token" to the credential_source section of the file that GOOGLE_APPLICATION_CREDENTIALS points to.

e.g.

{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/xxx/locations/global/workloadIdentityPools/xxx/providers/xxx",
  "subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
  "service_account_impersonation_url": "xxx",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "environment_id": "aws1",
    "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
    "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
    "regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15",
    "imdsv2_session_token_url": "http://169.254.169.254/latest/api/token"
  }
}
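If you'd rather patch an existing file than edit it by hand, a small script along these lines should work. It is a sketch with hypothetical helper names, using only the standard library, and it assumes the file has the external_account layout shown above.

```python
import json

IMDSV2_TOKEN_URL = "http://169.254.169.254/latest/api/token"


def add_imdsv2_token_url(config):
    """Return a copy of the credential config with the IMDSv2 token URL set."""
    updated = dict(config)
    source = dict(updated.get("credential_source", {}))
    source["imdsv2_session_token_url"] = IMDSV2_TOKEN_URL
    updated["credential_source"] = source
    return updated


def patch_credentials_file(path):
    """Rewrite the JSON file at `path` in place (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(add_imdsv2_token_url(config), handle, indent=2)
```

For example, `patch_credentials_file("./XX.json")` would update the file used in the original report.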

The Node.js docs mention this in passing: https://cloud.google.com/nodejs/docs/reference/google-auth-library/latest

The Python implementation is here: https://github.com/googleapis/google-auth-library-python/blob/9c87ad07c6618bc5b1be3b254fdf5211e7778061/google/auth/aws.py#L450-L469

tiolumbantobing commented 5 months ago

Hi @zchenyu, @sassmith, @ahlag, I tried adding imdsv2_session_token_url to the .json file, but I get a different error. For context, I'm running the script as an .ipynb notebook on a SageMaker notebook instance: Amazon Linux 2, Jupyter Lab 3 (notebook-al2-v2) | Minimum IMDS Version 2

Script

import os
from google.cloud import bigquery

os.environ["GOOGLE_CLOUD_PROJECT"] = 'xxxx'
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = 'xxx.json'
client = bigquery.Client()

query = """
SELECT name
FROM `xxx.xxx.city`
ORDER BY name ASC LIMIT 5
"""

query_job = client.query(query) # API request.
for row in query_job:
    print(f"{row['name']}")

Error

RefreshError: Unable to retrieve AWS security credentials: <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 415 Unsupported Media Type</title>
</head>
<body><h2>HTTP ERROR 415 Unsupported Media Type</h2>
<table>
<tr><th>URI:</th><td>/latest/meta-data/iam/security-credentials/BaseNotebookInstanceEc2InstanceRole</td></tr>
<tr><th>STATUS:</th><td>415</td></tr>
<tr><th>MESSAGE:</th><td>Unsupported Media Type</td></tr>
<tr><th>SERVLET:</th><td>RestEc2MetaDataProxyApplicationServlet</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 9.4.53.v20231009</a><hr/>

</body>
</html>

Is there any clue or resolution for this? I've searched, but the HTTP ERROR 415 Unsupported Media Type is a bit strange.

jchapian commented 1 month ago

@tiolumbantobing, for whatever reason this happens on SageMaker notebooks. Removing this line seems to resolve the issue. It's unclear whether the request header is needed at all.