awslabs / aws-athena-query-federation

The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own data sources and code.
Apache License 2.0

[BUG] Can't access serverless application from eu-west-1 from connector AthenaJdbcConnector #522

Closed Dandandan closed 2 years ago

Dandandan commented 2 years ago

Describe the bug

When upgrading the version from

arn:aws:serverlessrepo:us-east-1:292517598671:applications-AthenaJdbcConnector-versions-2021.27.1/c9f3ddcb-bfec-40d7-98fd-93f6481d74a2

to the latest

arn:aws:serverlessrepo:us-east-1:292517598671:applications-AthenaJdbcConnector-versions-2021.42.1/fb059e7c-610c-4956-b723-74e5db32b477

I am getting the following error:

"GetObject for awsserverlessrepo-changesets-18ssd5swmy82n/ACCOUNT_ID/arn:aws:serverlessrepo:us-east-1:292517598671:applications-AthenaJdbcConnector-versions-2021.42.1/fb059e7c-610c-4956-b723-74e5db32b477"

Reverting to the old version makes it work again, suggesting this has to do with the permissions on the bucket / object.

I have the problem with all versions after "2021.27.1".

To Reproduce

  1. Deploy a newer version (I am using sam to deploy the connector)
  2. Access is denied.


janmran commented 2 years ago

How are you deploying the new connector? Via publish.sh or through the Serverless App Repo console? If via publish.sh, can you make sure the bucket policy from https://github.com/awslabs/aws-athena-query-federation/blob/master/tools/publish.sh#L87-L104 is actually attached to your bucket and has the correct account ID? If it isn't, Serverless App Repo won't be able to publish.

markokole commented 2 years ago

Same problem with version 2021.51.1: if I run a small query (I assume no spill is required), I get the results in S3. If I run a bigger query, I get the 403 Access Denied error.

From Athena: Encountered an exception[com.amazonaws.services.s3.model.AmazonS3Exception] from your LambdaFunction[arn:aws:lambda:eu-west-1:831354200784:function:testfederatedquery] executed in context[S3SpillLocation{bucket='testfederatedquery-spill', key='athena-spill/5ff024a3-26e9-4469-aaa6-76b1fbf8d673/a498be1b-0215-4461-86e9-d3a5568f7996', directory=true}] with message[Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YX2S809S5YV52YFP; S3 Extended Request ID: OcNEhpQZcKVDWnNrgP1LnOTeonnZyPvn1/Nb0pbcGp0l/+XCS9equYo3Pvx4nPT+shZbEfYXQL8=; Proxy: null)]

From CloudWatch (Lambda log):

Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YX2S809S5YV52YFP; S3 Extended Request ID: OcNEhpQZcKVDWnNrgP1LnOTeonnZyPvn1/Nb0pbcGp0l/+XCS9equYo3Pvx4nPT+shZbEfYXQL8=; Proxy: null): com.amazonaws.services.s3.model.AmazonS3Exception
com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: YX2S809S5YV52YFP; S3 Extended Request ID: OcNEhpQZcKVDWnNrgP1LnOTeonnZyPvn1/Nb0pbcGp0l/+XCS9equYo3Pvx4nPT+shZbEfYXQL8=; Proxy: null), S3 Extended Request ID: OcNEhpQZcKVDWnNrgP1LnOTeonnZyPvn1/Nb0pbcGp0l/+XCS9equYo3Pvx4nPT+shZbEfYXQL8=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1811)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1395)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1371)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
    at com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:394)
    at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:5950)
    at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1812)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1772)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1710)
    at com.amazonaws.athena.connector.lambda.data.S3BlockSpiller.write(S3BlockSpiller.java:313)
    at com.amazonaws.athena.connector.lambda.data.S3BlockSpiller.lambda$spillBlock$16(S3BlockSpiller.java:367)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Let me know if you need more information.

janmran commented 2 years ago

@markokole your issue is unrelated to the original one. Your Lambda function's role needs the S3CrudPolicy on your spill bucket.

Dandandan commented 2 years ago

@janmran

I deploy the connector/serverless app directly via CloudFormation by referencing the serverless app ARN. This worked for version 2021.27.1, but doesn't work for the versions after that (I get the permissions error).

markokole commented 2 years ago

@janmran I'm creating the Lambda using the AWS console, and the inline policy JdbcConnectorConfigRolePolicy4 covers S3CrudPolicy.

Here's the inline policy that is generated:

{
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:GetObjectVersion",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:GetLifecycleConfiguration",
                "s3:PutLifecycleConfiguration",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::testfederatedquery-spill",
                "arn:aws:s3:::testfederatedquery-spill/*"
            ],
            "Effect": "Allow"
        }
    ]
}

declark1 commented 2 years ago

I'm having the same access denied issue despite the correct S3 policy being attached to the role.

janmran commented 2 years ago

@Dandandan, make sure this statement is in your bucket policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "serverlessrepo.amazonaws.com"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<bucket where the package connector code is uploaded to>/*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "<account id that is calling sam to publish the connector code>"
                }
            }
        }
    ]
}
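
As a rough way to verify the statement above, the sketch below builds it for a given bucket and account and checks whether an already-fetched bucket policy (parsed from JSON into a dict) contains an equivalent statement. The function names and the placeholder bucket and account values are hypothetical and not part of the SDK or of publish.sh. The check uses exact string matching only, so it is a simplification of how IAM really evaluates policies.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def serverlessrepo_read_statement(bucket: str, account_id: str) -> dict:
    """Build the bucket-policy statement quoted above (hypothetical helper)."""
    return {
        "Effect": "Allow",
        "Principal": {"Service": "serverlessrepo.amazonaws.com"},
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
        "Condition": {"StringEquals": {"aws:SourceAccount": account_id}},
    }


def has_serverlessrepo_read(policy: dict, bucket: str, account_id: str) -> bool:
    """Return True if the policy has an Allow statement matching the one above.

    Assumption: exact matches only; wildcards and other condition
    operators are not interpreted.
    """
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "Principal": "*" is not what we are looking for
        services = _as_list(principal.get("Service", []))
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        condition = stmt.get("Condition", {}).get("StringEquals", {})
        accounts = _as_list(condition.get("aws:SourceAccount"))
        if (
            "serverlessrepo.amazonaws.com" in services
            and "s3:GetObject" in actions
            and f"arn:aws:s3:::{bucket}/*" in resources
            and account_id in accounts
        ):
            return True
    return False
```

A policy that lacks the statement, or that carries a different account ID in `aws:SourceAccount`, makes the check return False, which matches the "correct account id" concern raised earlier in the thread.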

janmran commented 2 years ago

@markokole If you look at the stack trace you provided, the exception is coming from S3 due to this call:

com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1710)
    at com.amazonaws.athena.connector.lambda.data.S3BlockSpiller.write(S3BlockSpiller.java:313)

Are you positive that the role the Lambda is using (NOT the role the calling user is using to execute the query) has PutObject permissions on the bucket?
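
One rough way to answer that question offline is to test a role's identity policy document against the action and object ARN involved. The sketch below is a simplified check, not how IAM actually evaluates access: it looks only at Allow statements and ignores explicit Deny statements, conditions, bucket policies, and anything else outside the document. The `allows` function is a hypothetical helper, and `ROLE_POLICY` is a trimmed copy of the inline policy posted earlier in the thread.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def allows(policy: dict, action: str, resource: str) -> bool:
    """Return True if some Allow statement covers the action and resource.

    Assumption: '*' and '?' wildcards are matched with fnmatchcase, which
    also interprets '[...]'; that is good enough for typical S3 ARNs.
    """
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        # Action names are compared case-insensitively, ARNs case-sensitively.
        action_ok = any(fnmatchcase(action.lower(), a.lower()) for a in actions)
        resource_ok = any(fnmatchcase(resource, r) for r in resources)
        if action_ok and resource_ok:
            return True
    return False


# Trimmed copy of the inline policy from the thread.
ROLE_POLICY = {
    "Statement": [
        {
            "Action": ["s3:GetObject", "s3:ListBucket", "s3:PutObject", "s3:DeleteObject"],
            "Resource": [
                "arn:aws:s3:::testfederatedquery-spill",
                "arn:aws:s3:::testfederatedquery-spill/*",
            ],
            "Effect": "Allow",
        }
    ]
}

spill_key = "arn:aws:s3:::testfederatedquery-spill/athena-spill/example-key"
print(allows(ROLE_POLICY, "s3:PutObject", spill_key))
```

If a check like this passes for the policy attached to the Lambda's execution role and the 403 persists, the cause is likely somewhere this sketch does not look, such as the function running under a different role than expected or a restriction on the bucket itself.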