apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Bad/outdated documentation on "S3 permissions settings" #10516

Closed brskq closed 11 months ago

brskq commented 3 years ago

Druid ingestion tasks fail when only the S3 permissions listed as required in the documentation are granted. The documentation says:

S3 permissions settings

s3:GetObject and s3:PutObject are basically required for pushing/loading segments to/from S3. If druid.storage.disableAcl is set to false, then s3:GetBucketAcl and s3:PutObjectAcl are additionally required to set ACL for objects.

I had also set the AWS region in the jvm.config files. I had to contact AWS support to figure out what was wrong, and they came back with the following information:

It looks like Druid is indeed making requests to check the ACL as well, even though this is not mentioned in the Druid documentation, since the request failed at operation "REST.GET.ACL".

So the user arn:aws:iam::123456789101:user/my-bucket-user will need the policy below attached as the minimum required permissions.


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObjectAcl",
                "s3:GetObject",
                "s3:GetObjectVersionAcl",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:GetBucketAcl",
                "s3:GetBucketLocation",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::my-s3-bucket",
                "arn:aws:s3:::my-s3-bucket/*"
            ]
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        }
    ]
}
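
To see which actions the support policy grants beyond the four named in the documentation excerpt, the two sets can be diffed. The following is a minimal Python sketch, not a statement of what Druid strictly needs: `DOCUMENTED` is transcribed from the quoted documentation text, `allowed_actions` is a hypothetical helper name, and the policy is the one above with the `Sid` fields omitted for brevity.

```python
import json

# Actions named in the quoted "S3 permissions settings" documentation text.
DOCUMENTED = {
    "s3:GetObject",
    "s3:PutObject",
    "s3:GetBucketAcl",
    "s3:PutObjectAcl",
}

# Policy suggested by AWS support (Sid fields omitted for brevity).
POLICY_JSON = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject", "s3:GetObjectAcl", "s3:GetObject",
                "s3:GetObjectVersionAcl", "s3:ListBucket", "s3:DeleteObject",
                "s3:GetBucketAcl", "s3:GetBucketLocation", "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::my-s3-bucket",
                "arn:aws:s3:::my-s3-bucket/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
        }
    ]
}
"""


def allowed_actions(policy: dict) -> set:
    """Collect every action from the policy's Allow statements."""
    actions = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        action = statement["Action"]
        # "Action" may be a single string or a list of strings.
        actions.update([action] if isinstance(action, str) else action)
    return actions


granted = allowed_actions(json.loads(POLICY_JSON))
extra = sorted(granted - DOCUMENTED)    # granted but not in the docs
missing = sorted(DOCUMENTED - granted)  # in the docs but not granted

print("Not mentioned in the documentation:", extra)
print("Documented but not granted:", missing)
```

The diff only shows what the documentation leaves out relative to this policy. It does not tell which of the extra actions are strictly necessary, so trimming the policy further would need testing against a real cluster.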

brskq commented 3 years ago

Let me also add my apache-druid/conf/druid/single-server/small/_common/common.runtime.properties:

# ms_conf_begin
druid.indexer.logs.s3Bucket=my-s3-bucket
druid.indexer.logs.s3Prefix=dev01/apache-druid/indexing-logs
druid.indexer.logs.type=s3
druid.metadata.storage.connector.connectURI=jdbc:postgresql://database-endpoint:5432/mydb
druid.metadata.storage.connector.password=mypw
druid.metadata.storage.connector.user=myusr
druid.metadata.storage.type=postgresql
druid.storage.baseKey=dev01/apache-druid/segments
druid.storage.bucket=my-s3-bucket
druid.storage.storageDirectory=/mnt/druid/druid/var/druid/segments
druid.storage.type=s3
# ms_conf_end
druid.emitter.logging.logLevel=info
druid.emitter=noop
druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches", "postgresql-metadata-storage", "druid-s3-extensions"]
druid.host=localhost
druid.indexer.logs.directory=var/druid/indexing-logs
druid.indexing.doubleStorage=double
druid.lookup.enableLookupSyncOnStartup=false
druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
druid.selectors.coordinator.serviceName=druid/coordinator
druid.selectors.indexing.serviceName=druid/overlord
druid.server.hiddenProperties=["druid.s3.accessKey","druid.s3.secretKey","druid.metadata.storage.connector.password"]
druid.sql.enable=true
druid.startup.logging.logProperties=true
druid.zk.paths.base=/druid
druid.zk.service.host=localhost

The setup authenticates with the credentials from the instance profile.
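
For cross-checking a configuration like the one above against an IAM policy, the bucket ARNs that the policy's `Resource` list has to cover can be derived from the properties file. This is a rough Python sketch under stated assumptions: `parse_properties`, `s3_resource_arns` and `acl_enabled` are hypothetical helper names, `SAMPLE` is a subset of the properties shown above, and an unset `druid.storage.disableAcl` is treated as `false`.

```python
# Subset of the common.runtime.properties shown above.
SAMPLE = """
# ms_conf_begin
druid.indexer.logs.s3Bucket=my-s3-bucket
druid.indexer.logs.s3Prefix=dev01/apache-druid/indexing-logs
druid.indexer.logs.type=s3
druid.storage.baseKey=dev01/apache-druid/segments
druid.storage.bucket=my-s3-bucket
druid.storage.type=s3
# ms_conf_end
"""


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def s3_resource_arns(props: dict) -> list:
    """Return bucket and object ARNs for every S3 bucket the config uses."""
    buckets = set()
    if props.get("druid.storage.type") == "s3":
        buckets.add(props.get("druid.storage.bucket"))
    if props.get("druid.indexer.logs.type") == "s3":
        buckets.add(props.get("druid.indexer.logs.s3Bucket"))
    buckets.discard(None)  # a bucket property may be absent
    arns = []
    for bucket in sorted(buckets):
        arns.append(f"arn:aws:s3:::{bucket}")    # bucket-level actions
        arns.append(f"arn:aws:s3:::{bucket}/*")  # object-level actions
    return arns


def acl_enabled(props: dict) -> bool:
    """True unless druid.storage.disableAcl is explicitly 'true'.

    Assumption: an unset property behaves like 'false'.
    """
    return props.get("druid.storage.disableAcl", "false").lower() != "true"


props = parse_properties(SAMPLE)
print("Resources the policy should cover:", s3_resource_arns(props))
print("ACL permissions expected per the docs:", acl_enabled(props))
```

With this configuration both segments and indexing logs go to the same bucket, so a single pair of ARNs is enough, which matches the `Resource` list in the policy from AWS support.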

mattmassicotte commented 3 years ago

@brskq thank you so much for posting this! I was having issues with historical nodes loading data from S3, and permissions were indeed the problem.

brskq commented 3 years ago

Hey @mattmassicotte, happy I could help! Have a wonderful day ahead :)

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

github-actions[bot] commented 11 months ago

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.