awslabs / aws-glue-libs

AWS Glue Libraries are additions and enhancements to Spark for ETL operations.

AWS SSO not supported for CLI profile? #138

Closed kherrera-ebsco closed 2 years ago

kherrera-ebsco commented 2 years ago

We use AWS SSO to authenticate our AWS CLI profiles. When trying to submit a job, I receive the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o39.getCatalogSource.
: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY)), SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey), WebIdentityTokenCredentialsProvider: You must specify a value for roleArn and roleSessionName, com.amazonaws.auth.profile.ProfileCredentialsProvider@c0c09ba: Unable to load credentials into profile [medi-devqa-developers]: AWS Access Key ID is not specified., com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@10e7171d: Failed to connect to service endpoint: ]
        at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:136)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1257)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:833)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:783)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
        at com.amazonaws.services.glue.AWSGlueClient.doInvoke(AWSGlueClient.java:11244)
        at com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:11211)
        at com.amazonaws.services.glue.AWSGlueClient.invoke(AWSGlueClient.java:11200)
        at com.amazonaws.services.glue.AWSGlueClient.executeGetTable(AWSGlueClient.java:6691)
        at com.amazonaws.services.glue.AWSGlueClient.getTable(AWSGlueClient.java:6660)
        at com.amazonaws.services.glue.util.DataCatalogWrapper.$anonfun$getTable$2(DataCatalogWrapper.scala:163)
        at com.amazonaws.services.glue.util.ErieRetryWrapper$.$anonfun$executeWithRetry$1(DataCatalogWrapper.scala:974)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at scala.util.Try$.apply(Try.scala:213)
        at com.amazonaws.services.glue.util.ErieRetryWrapper$.executeWithRetry(DataCatalogWrapper.scala:974)
        at com.amazonaws.services.glue.util.DataCatalogWrapper.$anonfun$getTable$1(DataCatalogWrapper.scala:162)
        at scala.util.Try$.apply(Try.scala:213)
        at com.amazonaws.services.glue.util.DataCatalogWrapper.getTable(DataCatalogWrapper.scala:140)
        at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:199)
        at com.amazonaws.services.glue.GlueContext.getCatalogSource(GlueContext.scala:181)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:750)

When authenticated via SSO, there is no ~/.aws/credentials file.

moomindani commented 2 years ago

This library uses the AWS SDK for Java v1. Unfortunately, according to this doc (https://docs.aws.amazon.com/sdkref/latest/guide/feature-sso-credentials.html), the AWS SDK for Java v1 does not support AWS SSO integration.
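
A possible workaround, not confirmed anywhere in this thread: the credential chain in the error above tries environment variables first, so temporary SSO credentials can be resolved outside the JVM and handed over that way. The sketch below assumes AWS CLI v2 with the `aws configure export-credentials` command available and a prior `aws sso login`. The helper names `credentials_to_env` and `export_sso_credentials` are hypothetical, not part of aws-glue-libs.

```python
import json
import os
import subprocess


def credentials_to_env(process_json: str) -> dict:
    """Map credential_process-style JSON to environment variables that the
    AWS SDK for Java v1 credential chain reads."""
    creds = json.loads(process_json)
    env = {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
    }
    # SSO credentials are temporary, so a session token is normally present.
    if creds.get("SessionToken"):
        env["AWS_SESSION_TOKEN"] = creds["SessionToken"]
    return env


def export_sso_credentials(profile: str) -> None:
    """Resolve the profile with AWS CLI v2 (assumed to be installed) and copy
    the resulting credentials into this process's environment."""
    out = subprocess.run(
        ["aws", "configure", "export-credentials",
         "--profile", profile, "--format", "process"],
        check=True, capture_output=True, text=True,
    ).stdout
    os.environ.update(credentials_to_env(out))


if __name__ == "__main__":
    # Profile name taken from the error message above.
    export_sso_credentials("medi-devqa-developers")
```

The call would need to happen before the SparkContext or GlueContext is created, so that the JVM started by PySpark inherits the variables. The credentials expire with the SSO session and have to be exported again afterwards.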

kherrera-ebsco commented 2 years ago

Are there any plans to upgrade the version of the library used?

moomindani commented 2 years ago

The SDK version comes from a Spark library dependency. Once Spark upgrades that library, you will be able to use those features. It may be a good idea to file a feature request in the OSS Spark JIRA. https://github.com/apache/spark/blob/master/pom.xml#L151