aws / aws-sdk-java

The official AWS SDK for Java 1.x. The AWS SDK for Java 2.x is available here: https://github.com/aws/aws-sdk-java-v2/
https://aws.amazon.com/sdkforjava
Apache License 2.0

"Access key or secret key are null" while using InstanceProfileCredentialsProvider.getCredentials() #3117

Closed ChrisCollinsIBM closed 2 months ago

ChrisCollinsIBM commented 4 months ago


Describe the bug

We have a customer who, when using EC2 instance credentials via com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(), gets this error back: Unable to load credentials. Access key or secret key are null

No errors are coming back from the EC2ResourceFetcher but no temporary keys are being retrieved.

We verified with curl that IMDSv2 is enabled. A basic curl that gets a token and then requests temporary credentials from /latest/meta-data/identity-credentials/ec2/security-credentials/ec2-instance does return credentials. The SDK workflow above, which requests credentials for the named role returned from latest/meta-data/iam/security-credentials/, returns the error shown below in Current Behavior.
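The role-specific workflow described above can be reproduced outside the SDK with a small stand-alone probe. The following is a minimal sketch, not the SDK's implementation: the class name `ImdsV2Probe`, the helper names, the 2-second timeouts, and the 21600-second token TTL are all assumptions chosen for illustration. It prints only the status code and body length of each call, so no credentials end up in the output.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class; not part of the AWS SDK.
class ImdsV2Probe {
    static final String BASE = "http://169.254.169.254";
    static final String ROLE_PATH = "/latest/meta-data/iam/security-credentials/";

    // The role listing is plain text; take the first non-empty line as the role name.
    static String firstRoleName(String listing) {
        for (String line : listing.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                return trimmed;
            }
        }
        return null;
    }

    // Builds the role-specific credentials URL (the second-level call).
    static String credentialsUrl(String roleName) {
        return BASE + ROLE_PATH + roleName;
    }

    // Reads a stream fully; written as a loop to stay Java 8 compatible.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Sends one request with a single header and logs status and body length only.
    static String call(String method, String url, String header, String value) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod(method);
        conn.setConnectTimeout(2000); // assumed timeout
        conn.setReadTimeout(2000);    // assumed timeout
        conn.setRequestProperty(header, value);
        if ("PUT".equals(method)) {
            conn.setDoOutput(true);
            conn.getOutputStream().close(); // empty request body
        }
        int status = conn.getResponseCode();
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        String body = in == null ? "" : readAll(in);
        System.out.println(method + " " + url + " -> " + status + " (" + body.length() + " chars)");
        return body.trim();
    }

    public static void main(String[] args) throws IOException {
        // Step 1: session token. Step 2: role listing. Step 3: role-specific credentials.
        String token = call("PUT", BASE + "/latest/api/token",
                "x-aws-ec2-metadata-token-ttl-seconds", "21600");
        String listing = call("GET", BASE + ROLE_PATH, "x-aws-ec2-metadata-token", token);
        String role = firstRoleName(listing);
        if (role == null) {
            System.out.println("No role name returned by " + ROLE_PATH);
            return;
        }
        call("GET", credentialsUrl(role), "x-aws-ec2-metadata-token", token);
    }
}
```

Run on the affected instance, a 200 status with an unexpectedly short body on the third call would suggest that the role-specific response does not carry the usual credential fields.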

Expected Behavior

Temporary credentials should be retrieved via the IMDSv2 endpoint, and if they cannot be, a more useful error should be reported that points to permissions or whatever other issue is the cause.

Current Behavior

2024/06/04 12:57:40:487 UTC [DEBUG] BaseCredentialsFetcher - Retrieving credentials. 
2024/06/04 12:57:40:496 UTC [DEBUG] EC2ResourceFetcher - Executing PUT http://169.254.169.254/latest/api/token with headers [User-Agent, Connection, Accept, x-aws-ec2-metadata-token-ttl-seconds]
2024/06/04 12:57:40:519 UTC [DEBUG] EC2ResourceFetcher - Got response code 200 from PUT http://169.254.169.254/latest/api/token
2024/06/04 12:57:40:519 UTC [DEBUG] EC2ResourceFetcher - Completed PUT http://169.254.169.254/latest/api/token after 23ms 
2024/06/04 12:57:40:520 UTC [DEBUG] EC2ResourceFetcher - Executing GET http://169.254.169.254/latest/meta-data/iam/security-credentials/ with headers [User-Agent, Connection, x-aws-ec2-metadata-token, Accept] 
2024/06/04 12:57:40:521 UTC [DEBUG] EC2ResourceFetcher - Got response code 200 from GET http://169.254.169.254/latest/meta-data/iam/security-credentials/ 
2024/06/04 12:57:40:521 UTC [DEBUG] EC2ResourceFetcher - Completed GET http://169.254.169.254/latest/meta-data/iam/security-credentials/ after 1ms 
2024/06/04 12:57:40:521 UTC [DEBUG] EC2ResourceFetcher - Executing PUT http://169.254.169.254/latest/api/token with headers [User-Agent, Connection, Accept, x-aws-ec2-metadata-token-ttl-seconds]
2024/06/04 12:57:40:523 UTC [DEBUG] EC2ResourceFetcher - Got response code 200 from PUT http://169.254.169.254/latest/api/token
2024/06/04 12:57:40:523 UTC [DEBUG] EC2ResourceFetcher - Completed PUT http://169.254.169.254/latest/api/token after 1ms 
2024/06/04 12:57:40:523 UTC [DEBUG] EC2ResourceFetcher - Executing GET http://169.254.169.254/latest/meta-data/iam/security-credentials/example-ec2-instance-role with headers [User-Agent, Connection,x-aws-ec2-metadata-token, Accept] 
2024/06/04 12:57:40:525 UTC [DEBUG] EC2ResourceFetcher - Got response code 200 from GET http://169.254.169.254/latest/meta-data/iam/security-credentials/example-ec2-instance-role
2024/06/04 12:57:40:525 UTC [DEBUG] EC2ResourceFetcher - Completed GET http://169.254.169.254/latest/meta-data/iam/security-credentials/example-ec2-instance-role 2ms 

Exception in thread "main" com.amazonaws.SdkClientException: Unable to load credentials. Access key or secret key are null.
      at com.amazonaws.auth.BaseCredentialsFetcher.fetchCredentials(BaseCredentialsFetcher.java:155)
      at com.amazonaws.auth.BaseCredentialsFetcher.getCredentials(BaseCredentialsFetcher.java:89)
      at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:174)
      at InstanceProfileCredentialsTest.main(InstanceProfileCredentialsTest.java:11) 

This output is the result of the following Java code:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.InstanceProfileCredentialsProvider;

public class InstanceProfileCredentialsTest
{
    public static void main(String[] arg)
    {
        AWSCredentialsProvider credentialsProvider = InstanceProfileCredentialsProvider.getInstance();

        AWSCredentials credentials = credentialsProvider.getCredentials();

        System.out.println("Access Key : " + credentials.getAWSAccessKeyId());
        System.out.println("Secret Key : " + credentials.getAWSSecretKey());
    }
}

This was run using

This same code run on another EC2 instance with the same jars works fine.

Reproduction Steps

Outlined in the previous section:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.InstanceProfileCredentialsProvider;

public class InstanceProfileCredentialsTest
{
    public static void main(String[] arg)
    {
        AWSCredentialsProvider credentialsProvider = InstanceProfileCredentialsProvider.getInstance();

        AWSCredentials credentials = credentialsProvider.getCredentials();

        System.out.println("Access Key : " + credentials.getAWSAccessKeyId());
        System.out.println("Secret Key : " + credentials.getAWSSecretKey());
    }
}

Wire logs are not available because this portion of the SDK uses HttpURLConnection instead of HttpClient, so we cannot get the response into the logs without modifying the classes and rebuilding.

Possible Solution

No solution in mind, additional logging would be helpful to see the raw response.
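One way to log what matters about the raw response without leaking secrets is to report which fields are present rather than their values. The sketch below assumes the body is the flat JSON object that IMDS documents for role credentials (fields such as `Code`, `AccessKeyId`, `SecretAccessKey`, `Token`). The class `CredentialsBodyCheck`, its helpers, and both sample bodies are hypothetical; the failing sample only illustrates a 200 response without keys and is not the body the customer received, which was never captured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a credentials response body; not part of the AWS SDK.
class CredentialsBodyCheck {

    // Extracts a string field with a simple regex. Adequate for a flat JSON
    // object like the assumed IMDS body; this is not a general JSON parser.
    static String field(String json, String name) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    // Reports whether a field is present, without revealing its value.
    static String presence(String json, String name) {
        String value = field(json, name);
        return name + "=" + (value == null || value.isEmpty() ? "MISSING" : "present");
    }

    // One-line summary that is safe to write to a log.
    static String summarize(String json) {
        return "Code=" + field(json, "Code")
                + " " + presence(json, "AccessKeyId")
                + " " + presence(json, "SecretAccessKey")
                + " " + presence(json, "Token");
    }

    public static void main(String[] args) {
        // Both bodies are made-up samples for illustration.
        String ok = "{ \"Code\" : \"Success\", \"AccessKeyId\" : \"ASIAEXAMPLE\", "
                + "\"SecretAccessKey\" : \"secretEXAMPLE\", \"Token\" : \"tokenEXAMPLE\" }";
        String noKeys = "{ \"Code\" : \"AssumeRoleUnauthorizedAccess\", "
                + "\"Message\" : \"example message\" }";
        System.out.println(summarize(ok));
        System.out.println(summarize(noKeys));
    }
}
```

A summary line with `MISSING` for `AccessKeyId` or `SecretAccessKey` alongside a 200 status would be consistent with the "Access key or secret key are null" message, and the `Code` value would then be the first thing to look at.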

Additional Information/Context

We're trying to understand why this isn't working when no relevant errors are seen. I don't believe calls to IMDS endpoints are logged in CloudTrail, so I don't know whether there would be any logging there that the customer could look at.

The system and AWS account are at arm's length from us, so we're restricted in troubleshooting. We're unable to re-create this issue in our environment at all, but we have the output to show it is happening.

Any assistance is greatly appreciated.

AWS Java SDK version used

1.12.675

JDK version used

1.8.0_401

Operating System and version

Red Hat Enterprise Linux Server release 7.9 (Maipo)

ChrisCollinsIBM commented 3 months ago

Any ETA on getting assistance with this or getting it triaged?

ChrisCollinsIBM commented 3 months ago

Still looking for some assistance with this, thanks!

debora-ito commented 3 months ago

@ChrisCollinsIBM looking at the logs provided in the Current Behavior, there are two separate calls to generate a session token, and both are successful (I don't know why there are two calls, though). I'm guessing this is the call to obtain the credentials associated with the instance role "example-ec2-instance-role":

2024/06/04 12:57:40:525 UTC [DEBUG] EC2ResourceFetcher - Completed GET http://169.254.169.254/latest/meta-data/iam/security-credentials/example-ec2-instance-role 2ms

which is also successful.

I'm not quite clear at what point the error "Unable to load credentials" is thrown; in the logs the stack trace seems to be cut out from the timeline. Is it immediately after the last [DEBUG] line?

ChrisCollinsIBM commented 3 months ago

Yes @debora-ito, that exception is thrown right after. There's no try/catch block logging the exception, so you're not seeing a timestamp; the exception is just thrown. It's the result of the code logged above in the issue, but I'll repeat it here:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.InstanceProfileCredentialsProvider;

public class InstanceProfileCredentialsTest
{
    public static void main(String[] arg)
    {
        AWSCredentialsProvider credentialsProvider = InstanceProfileCredentialsProvider.getInstance();

        AWSCredentials credentials = credentialsProvider.getCredentials();

        System.out.println("Access Key : " + credentials.getAWSAccessKeyId());
        System.out.println("Secret Key : " + credentials.getAWSSecretKey());
    }
}

As you mention, I believe there are 2 calls because call one is to http://169.254.169.254/latest/meta-data/iam/security-credentials/ and call two is to http://169.254.169.254/latest/meta-data/iam/security-credentials/example-ec2-instance-role.

example-ec2-instance-role is sanitized from the original logs but does match the role name assigned to the EC2 instance role.

This isn't happening in our environment but at a customer's. I did have them run a shell script that uses curl to hit the security-credentials endpoint and get the temporary credentials there (which did work), but I didn't have the script go to the next level and make the second call to the role-specific path.

ChrisCollinsIBM commented 2 months ago

The customer got the issue resolved, but we unfortunately didn't get any information on what fixed it.

I'll close this issue since there's nothing to actually fix or reproduce at this point, but it'll remain searchable if someone else hits it.

github-actions[bot] commented 2 months ago

This issue is now closed.

Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.