thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Thanos Sidecar err="check exists: stat s3 object: Access Denied." #3677

Closed abhijeetab07 closed 3 years ago

abhijeetab07 commented 3 years ago

Hi Team,

/usr/local/bin/thanos sidecar --tsdb.path=/var/lib/prometheus --prometheus.url=http://localhost:9090 --objstore.config-file /var/lib/thanos/objectstore.yaml
level=info ts=2020-12-28T10:34:58.137650889Z caller=main.go:98 msg="Tracing will be disabled"
level=info ts=2020-12-28T10:34:58.13788683Z caller=options.go:23 protocol=gRPC msg="disabled TLS, key and cert must be set to enable"
level=info ts=2020-12-28T10:34:58.138189187Z caller=factory.go:46 msg="loading bucket configuration"
level=info ts=2020-12-28T10:34:58.138518739Z caller=sidecar.go:291 msg="starting sidecar"
level=info ts=2020-12-28T10:34:58.138653441Z caller=reloader.go:183 component=reloader msg="nothing to be watched"
level=info ts=2020-12-28T10:34:58.138705885Z caller=intrumentation.go:48 msg="changing probe status" status=ready
level=info ts=2020-12-28T10:34:58.139555567Z caller=grpc.go:116 service=gRPC/server component=sidecar msg="listening for serving gRPC" address=0.0.0.0:10901
level=info ts=2020-12-28T10:34:58.13960344Z caller=intrumentation.go:60 msg="changing probe status" status=healthy
level=info ts=2020-12-28T10:34:58.139616301Z caller=http.go:58 service=http/server component=sidecar msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2020-12-28T10:34:58.147099692Z caller=sidecar.go:155 msg="successfully loaded prometheus external 
level=info ts=2020-12-28T10:34:58.147144958Z caller=intrumentation.go:48 msg="changing probe status" status=ready
level=warn ts=2020-12-28T10:35:00.215032014Z caller=sidecar.go:275 err="check exists: stat s3 object: Access Denied." uploaded=0
level=warn ts=2020-12-28T10:35:30.176978953Z caller=sidecar.go:275 err="check exists: stat s3 object: Access Denied." uploaded=0
level=warn ts=2020-12-28T10:36:00.170973791Z caller=sidecar.go:275 err="check exists: stat s3 object: Access Denied." uploaded=0
level=warn ts=2020-12-28T10:36:30.167065199Z caller=sidecar.go:275 err="check exists: stat s3 object: Access Denied." uploaded=0

When I manually do this, it works completely fine:

aws s3 ls s3://my-bucket
           test.txt

I found a related issue, #394, and saw that #450 was merged, so I upgraded Thanos to 0.17.2.

The Role on my S3 bucket has the below permissions as given in the docs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>/*",
                "arn:aws:s3:::<bucket>"
            ]
        }
    ]
}

Thanos, version 0.17.2

Prometheus, version 2.11.1

Environment:

Could you please take a look and tell me whether my approach is correct? I want to avoid using an access key and secret key and rely on the instance profile instead. Please also let me know if I have missed anything. Thanks a lot in advance.

yeya24 commented 3 years ago

I don't have much experience with S3, but it seems our docs are outdated.

From the error logs, the stat operation is being denied. The sidecar does this here, so maybe you need to add s3:HeadObject to the policy.

If this works for you, then we need to update that doc.

abhijeetab07 commented 3 years ago

Thank you @yeya24 for your reply. I added s3:HeadObject, but the issue persists. s3:HeadObject is also not recognized as an action in the IAM policy, so I tried s3:GetObject instead, but I still get the same errors. Even with full S3 permissions it does not work for us.
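
A note on why s3:HeadObject is rejected: as far as I know, IAM has no separate s3:HeadObject action, and a HEAD ("stat") request on an object is authorized through s3:GetObject. S3 also answers a HEAD for a missing key with 403 instead of 404 when the caller lacks s3:ListBucket on the bucket itself. The sketch below is only a rough local check of a policy document for those two grants. The function names and the bucket name my-bucket are hypothetical, and the matching is a simplification that ignores Deny statements, Conditions and Principals.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action/Resource/Statement."""
    return value if isinstance(value, list) else [value]


def allows(policy, action, resource):
    """True if some Allow statement matches both the action and the resource.

    Simplification: Deny, Condition and Principal are not evaluated, and
    fnmatch-style globbing only approximates IAM wildcard matching.
    """
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        action_ok = any(fnmatchcase(action.lower(), a.lower())
                        for a in _as_list(stmt.get("Action", [])))
        resource_ok = any(fnmatchcase(resource, r)
                          for r in _as_list(stmt.get("Resource", [])))
        if action_ok and resource_ok:
            return True
    return False


def missing_stat_grants(policy, bucket):
    """Grants a HEAD on an object relies on, minus those the policy provides."""
    needed = [
        ("s3:GetObject", f"arn:aws:s3:::{bucket}/any/key"),  # object-level ARN
        ("s3:ListBucket", f"arn:aws:s3:::{bucket}"),         # bucket-level ARN
    ]
    return [action for action, arn in needed if not allows(policy, action, arn)]


def make_policy(resources):
    """Build a policy with the four actions from the issue (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetObject",
                       "s3:DeleteObject", "s3:PutObject"],
            "Resource": resources,
        }],
    }


both = make_policy(["arn:aws:s3:::my-bucket/*", "arn:aws:s3:::my-bucket"])
objects_only = make_policy(["arn:aws:s3:::my-bucket/*"])

print(missing_stat_grants(both, "my-bucket"))          # []
print(missing_stat_grants(objects_only, "my-bucket"))  # ['s3:ListBucket']
```

An empty result only says the policy text looks sufficient. It says nothing about which identity the sidecar actually runs as, which is what several later comments in this thread turned out to be about.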

abhijeetab07 commented 3 years ago

Hello Team, can anyone have a look at it please. Thanks in Advance.

abhijeetab07 commented 3 years ago

Hello Team, did anyone get a chance to look into it? Thanks.

kakkoyun commented 3 years ago

Pinging @bwplotka as s3 object-store owner (Sorry for the noise)

kmai commented 3 years ago

I can confirm this is happening to me as well. Also with a cross-account policy using IRSA in EKS.

jerry153fish commented 3 years ago

Same here as well after upgrading to latest thanos. It worked before.

argychatzi commented 3 years ago

Same here! @jerry153fish can you please share which version worked for you?

jerry153fish commented 3 years ago

v0.14 is the working one.

stale[bot] commented 3 years ago

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

stale[bot] commented 3 years ago

Closing for now as promised, let us know if you need this to be reopened! 🤗

DerrickMartinez commented 3 years ago

This still an issue?

spjspjspj commented 2 years ago

I'm on 0.20.0. I think I'm having the same issue.

pathikritmodak commented 2 years ago

Facing the same issue on v0.24.0. It's weird that one of the Prometheus replicas is facing this issue while the other one is working fine.

ottramst commented 2 years ago

Can confirm the same issue on v0.25.2

kumarganesh2814 commented 2 years ago

I am having the same issue.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Restrict Non-https Requests",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mybucket/*",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}

level=info ts=2022-05-19T12:14:45.237899866Z caller=factory.go:46 msg="loading bucket configuration"
level=info ts=2022-05-19T12:14:45.238363784Z caller=inmemory.go:172 msg="created in-memory index cache" maxItemSizeBytes=131072000 maxSizeBytes=262144000 maxItems=maxInt
level=info ts=2022-05-19T12:14:45.238920072Z caller=options.go:24 protocol=gRPC msg="disabled TLS, key and cert must be set to enable"
level=info ts=2022-05-19T12:14:45.240592638Z caller=store.go:428 msg="starting store node"
level=info ts=2022-05-19T12:14:45.240665782Z caller=store.go:363 msg="initializing bucket store"
level=info ts=2022-05-19T12:14:45.240755799Z caller=intrumentation.go:60 msg="changing probe status" status=healthy
level=info ts=2022-05-19T12:14:45.240811358Z caller=http.go:63 service=http/server component=store msg="listening for requests and metrics" address=0.0.0.0:10902
level=info ts=2022-05-19T12:14:45.24097428Z caller=tls_config.go:191 service=http/server component=store msg="TLS is disabled." http2=false
level=warn ts=2022-05-19T12:14:45.984182456Z caller=intrumentation.go:54 msg="changing probe status" status=not-ready reason="bucket store initial sync: sync block: incomplete view: 1036 errors: meta.json file exists: 01FW85KR4R5FDMG427SKHJDF8A/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FW7QW9MRDTPDM9V5J26M9JC2/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWF1B05EM9F83YNQH9QZ0A04/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWD3HEX2QNZ3XPHC3ATYY2RV/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FW912N4TT88X8H3F05C7P2XR/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FW7YR0WTGAAWKNMRHWNFB238/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWADQSCJF50X3A27V4KRXCN3/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWDZ0BWSHBK96M9RFHC7D9WG/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FW9WHJ4VHKJN7RRTNEG0W9QE/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWDAD64RE0JCRE47PF7XK8EC/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWCNT0D6YFXZKS8M3XFMV9TJ/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWC82HWYS54ADPFWY0MCFQWC/meta.json: stat s3 object: Access Denied.; meta.json file exists: 01FWECQTCTR23FJ9R96WTF0K4G/meta.json: stat s3 object: Access Denied.;

victorlin-houzz commented 2 years ago

For cross-account S3 bucket writes, you need to add this to your S3 configmap (mine is from amazon/k8s):

put_user_metadata: {"X-Amz-Acl": "bucket-owner-full-control"}

This grants the main account permission to read/write the objects sent from other accounts.
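
To show where that setting would sit, here is a minimal sketch that prints an S3 object storage config containing it. The `type`, `config`, `bucket`, `endpoint` and `put_user_metadata` keys follow the Thanos S3 configuration layout as I understand it. The helper name, bucket name and endpoint are placeholders, not values from this thread.

```python
def render_objstore_yaml(bucket: str, endpoint: str) -> str:
    """Render a minimal S3 objstore config as YAML text.

    Only the put_user_metadata entry comes from the comment above;
    bucket and endpoint are caller-supplied placeholders.
    """
    lines = [
        "type: S3",
        "config:",
        f'  bucket: "{bucket}"',
        f'  endpoint: "{endpoint}"',
        "  put_user_metadata:",
        '    "X-Amz-Acl": "bucket-owner-full-control"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values; replace with your own bucket and regional endpoint.
print(render_objstore_yaml("my-bucket", "s3.us-east-1.amazonaws.com"), end="")
```

The printed text is what would go into the file passed via --objstore.config-file. Whether the extra header is accepted depends on the bucket's ACL and ownership settings, so treat this as a starting point.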

kumarganesh2814 commented 2 years ago

Hi @victorlin-houzz

I have resolved this issue by adding the correct permissions. Thanks for the suggestion.

Best Regards Ganesh Kumar

Lincon-Freitas commented 2 years ago

Hi @kumarganesh2814! What do you mean by correct permission? I am having the same problem and am pretty sure it is not a permission problem as I even allowed anything from anywhere in my S3 bucket policy for testing and still not working. Would love to understand how you fixed it.

kumarganesh2814 commented 2 years ago

Hi @Lincon-Freitas. As far as I know, the issue was that my EKS cluster's EC2 nodes weren't able to connect to the specified bucket. Adding this policy helped:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::prometheus-monitoring/*",
                "arn:aws:s3:::prometheus-monitoring"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}

Also, please check the IAM role policy associated with your instance.

Best Regards Ganesh

Lincon-Freitas commented 2 years ago

Hi @kumarganesh2814

Thanks! I found out the problem, just to keep it here (may help others):

I am using IRSA on EKS to write on an S3 bucket from another AWS account. I was configuring the S3 bucket resource policy properly to allow the external account but was not giving the right permissions on the IAM role associated with the service account. Anyway, thanks for replying!

sherifkayad commented 2 years ago

@Lincon-Freitas do you mind sharing your final IAM policies?

Lincon-Freitas commented 2 years ago

Hey @sherifkayad, sure!

This is the policy attached to the IAM role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::mybucket-monitor/*",
                "arn:aws:s3:::mybucket-monitor"
            ]
        }
    ]
}

This is the resource policy configured in my S3 bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::123:root",
                    "arn:aws:iam::456:root",
                    "arn:aws:iam::789:root"
                ]
            },
            "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::mybucket-monitor/*",
                "arn:aws:s3:::mybucket-monitor"
            ]
        }
    ]
}

Long time update: After making some changes on the architecture recently I also had to add the s3:GetObjectAcl and s3:PutObjectAcl actions because of the put_user_metadata: {"X-Amz-Acl": "bucket-owner-full-control"} setting.
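
The two policies above illustrate the general cross-account rule: the request has to be allowed on both sides, by the IAM role's identity policy and by the bucket's resource policy. A rough sketch of that intersection is below. The function names are hypothetical, and the check is deliberately naive: it compares action names only and does not evaluate Resource, Principal, Condition or Deny, nor expand wildcards such as "s3:*".

```python
def allowed_actions(policy: dict) -> set:
    """Collect the action names listed in Allow statements."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    actions = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        action = stmt.get("Action", [])
        actions.update([action] if isinstance(action, str) else action)
    return actions


def missing_cross_account(identity_policy, bucket_policy, required):
    """Required actions that are not allowed on BOTH sides, sorted."""
    effective = allowed_actions(identity_policy) & allowed_actions(bucket_policy)
    return sorted(set(required) - effective)


BASE = ["s3:PutObject", "s3:ListBucket", "s3:GetObject", "s3:DeleteObject"]
RESOURCES = ["arn:aws:s3:::mybucket-monitor/*", "arn:aws:s3:::mybucket-monitor"]

role_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": BASE, "Resource": RESOURCES}],
}
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam::123:root"]},
        "Action": BASE,
        "Resource": RESOURCES,
    }],
}

# Bucket policy alone is not enough: an empty role policy leaves everything missing.
print(missing_cross_account({"Statement": []}, bucket_policy, BASE))
# ['s3:DeleteObject', 's3:GetObject', 's3:ListBucket', 's3:PutObject']

# With the ACL header in use, the two extra actions from the update are still missing.
print(missing_cross_account(role_policy, bucket_policy,
                            BASE + ["s3:GetObjectAcl", "s3:PutObjectAcl"]))
# ['s3:GetObjectAcl', 's3:PutObjectAcl']
```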

yassineaouadi commented 1 year ago

I can reproduce the issue when upgrading the thanos-sidecar base image to 0.28.1 and then 0.29.0. Everything works fine with 0.27.0. I don't think it's related to S3/IAM policies, unless breaking changes were introduced in 0.28.1.

kailunwang-houzz commented 1 year ago

For cross-account S3 bucket writes, you need to add this to your S3 configmap (mine is from amazon/k8s):

put_user_metadata: {"X-Amz-Acl": "bucket-owner-full-control"}

This grants the main account permission to read/write the objects sent from other accounts.

Thanks @victorlin-houzz, this solved my problem.

rgallis commented 1 year ago

I get this error when implementing a cross-account IAM role with IRSA in an EKS cluster. There seems to be no way to configure the Thanos sidecar to assume the cross-account role that is granted access to the S3 bucket in the other account. I would prefer a cross-account IAM role over an S3 bucket policy. Any chance of this working in future releases?

smark88 commented 1 year ago

A Google search led me to this issue. My problem was self-inflicted, and I'll post the solution to help anyone else out.

My issue was in the trust relationship: I was using a wildcard for the KSA instead of the full KSA name. If you use a KSA with a wildcard like "system:serviceaccount:monitoring:prometheus-*", you must use StringLike. I had copy-pasted StringEquals.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::1234:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/SADNAF23AS"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "oidc.eks.us-east-1.amazonaws.com/id/SADNAF23AS:sub": "system:serviceaccount:monitoring:prometheus-*"
                }
            }
        }
    ]
}
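
The difference between the two condition operators can be sketched locally. This is an approximation of the documented semantics, not AWS's implementation: StringEquals compares the strings literally, while StringLike treats `*` as any run of characters and `?` as any single character. The service account name prometheus-server is a hypothetical example.

```python
import re


def string_equals(pattern: str, value: str) -> bool:
    """StringEquals: exact, case-sensitive match; '*' is just a character."""
    return pattern == value


def string_like(pattern: str, value: str) -> bool:
    """StringLike: '*' matches any run of characters, '?' any single one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


pattern = "system:serviceaccount:monitoring:prometheus-*"
sub = "system:serviceaccount:monitoring:prometheus-server"  # hypothetical KSA

print(string_equals(pattern, sub))  # False: the trust policy never matches
print(string_like(pattern, sub))    # True
```

With StringEquals, the role cannot be assumed by any real service account, because no token's `sub` claim literally ends in `*`. The sidecar then runs without the intended role and S3 answers with Access Denied.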

farisamour commented 5 months ago

Fixed it with the following changes:

1- wrong sidecar version, updated: thanosio/thanos:main-2021-12-08-d1acaea2 --> thanosio/thanos:v0.32.5

2- wrong Prometheus role name in the Prometheus Helm chart: arn:aws:iam::4724xxxx3827:role/xyz/alfa-prod --> arn:aws:iam::4724xxxxx3827:role/alfa-prod

3- wrong trust relationship.

4- removed bucket ACLs.
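
On item 2: an IAM role ARN includes the role's path, so two ARNs with the same role name but different paths do not refer to the same role. A small sketch for comparing them is below. The helper name and the account ID 111122223333 are placeholders, and only the role name alfa-prod and the path xyz come from the comment.

```python
def split_role_arn(arn: str):
    """Return (account_id, path, role_name) for an IAM role ARN."""
    prefix, sep, resource = arn.partition(":role/")
    if not prefix.startswith("arn:") or not sep or not resource:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    account_id = prefix.split(":")[4]
    path, _, name = resource.rpartition("/")
    # A role created without a path has the default path "/".
    return account_id, (f"/{path}/" if path else "/"), name


with_path = split_role_arn("arn:aws:iam::111122223333:role/xyz/alfa-prod")
without_path = split_role_arn("arn:aws:iam::111122223333:role/alfa-prod")

print(with_path)     # ('111122223333', '/xyz/', 'alfa-prod')
print(without_path)  # ('111122223333', '/', 'alfa-prod')
print(with_path == without_path)  # False: same name, different role ARN
```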