thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0
12.73k stars 2.04k forks

sidecar failing with GKE workload identity #5362

Open pranay142 opened 2 years ago

pranay142 commented 2 years ago

Thanos, Prometheus and Golang version used:

Helm Chart: kube-prometheus-stack-34.0.0
Prometheus: quay.io/prometheus/prometheus:v2.33.5
Thanos: quay.io/thanos/thanos:v0.25.1

Object Storage Provider:

We use GCS as the storage backend @bwplotka

What happened: Instead of generating a JSON file with Google application credentials, we used workload identity: we created a Google service account and annotated the Kubernetes service account (thanos) with it (https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#gcloud_3).

Followed below steps thereafter:

  1. Create a file with the below configuration.

       type: GCS
       config:
         bucket: "bucket-test-fuse"
         service_account: "thanos"

  2. Created kubernetes secret with the following command.

       kubectl create secret generic thanos-gcp-config --from-file=objstore.yml=thanos-sidecar-secret-sa.yaml -n monitoring

  3. Updated object storage config as below.

       thanos:
         objectStorageConfig:
           key: objstore.yml
           name: thanos-gcp-config

  4. Installed the helm chart.

With the above configuration, thanos-sidecar goes into the CrashLoopBackOff state with the error message below.

    kubectl logs -f prometheus-prometheus-kube-prometheus-prometheus-0 -c thanos-sidecar -n monitoring

    level=info ts=2022-05-13T13:15:36.105171546Z caller=options.go:27 protocol=gRPC msg="disabled TLS, key and cert must be set to enable"
    level=info ts=2022-05-13T13:15:36.105981183Z caller=factory.go:49 msg="loading bucket configuration"
    level=error ts=2022-05-13T13:15:36.106304295Z caller=main.go:132 err="invalid character 'h' in literal true (expecting 'r')\nfailed to create credentials from JSON\ngithub.com/thanos-io/thanos/pkg/objstore/gcs.NewBucketWithConfig\n\t/app/pkg/objstore/gcs/gcs.go:66\ngithub.com/thanos-io/thanos/pkg/objstore/gcs.NewBucket\n\t/app/pkg/objstore/gcs/gcs.go:52\ngithub.com/thanos-io/thanos/pkg/objstore/client.NewBucket\n\t/app/pkg/objstore/client/factory.go:63\nmain.runSidecar\n\t/app/cmd/thanos/sidecar.go:303\nmain.registerSidecar.func1\n\t/app/cmd/thanos/sidecar.go:73\nmain.main\n\t/app/cmd/thanos/main.go:130\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\ncreate GCS client\ngithub.com/thanos-io/thanos/pkg/objstore/client.NewBucket\n\t/app/pkg/objstore/client/factory.go:82\nmain.runSidecar\n\t/app/cmd/thanos/sidecar.go:303\nmain.registerSidecar.func1\n\t/app/cmd/thanos/sidecar.go:73\nmain.main\n\t/app/cmd/thanos/main.go:130\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581\npreparing sidecar command failed\nmain.main\n\t/app/cmd/thanos/main.go:132\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1581"
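
The start of the error ("invalid character 'h' in literal true (expecting 'r')" followed by "failed to create credentials from JSON") suggests that the service_account value "thanos" was handed to a JSON parser: after reading "t" the parser expects the literal true and stops at "h". The Python sketch below illustrates that kind of check. It assumes, without confirmation from this thread, that the field is meant to hold either an empty string or the contents of a service account JSON key; classify_service_account is a hypothetical helper, not part of Thanos, and Python's own error text differs from the Go message in the log.

```python
import json


def classify_service_account(value: str) -> str:
    """Classify a service_account value the way a JSON credentials loader might.

    Hypothetical helper for illustration only; not part of Thanos.
    """
    if value.strip() == "":
        # Assumption: an empty value means "use ambient credentials",
        # which is what workload identity would supply.
        return "empty"
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        # "thanos" lands here: it is a name, not a JSON document.
        return "not-json"
    # Valid JSON that is not an object (for example the literal true)
    # still cannot be a credentials document.
    return "json-object" if isinstance(parsed, dict) else "not-credentials"


if __name__ == "__main__":
    for candidate in ["thanos", "", '{"type": "service_account"}', "true"]:
        print(repr(candidate), "->", classify_service_account(candidate))
```

Under that assumption, a service account name in this field would always fail at startup, regardless of how workload identity is set up on the cluster.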

FranAguiar commented 2 years ago

I'm facing a similar issue, explained here: https://github.com/bitnami/charts/issues/10399

Did you find a solution? Maybe I misunderstood the whole thing: must the object store config be in the Thanos sidecars?

pranay142 commented 2 years ago

Yes, this works as expected.

Steps I followed to configure Prometheus and Thanos:

Install prometheus:

  1. Create a secret with objstore information.

       kubectl create secret generic thanos-gcp-config --from-file=thanos.yaml=thanos-sidecar-secret.yaml -n monitoring

  2. Update thanos objectStorageConfig in kube-prometheus-stack helm chart as below.

       objectStorageConfig:
         key: objstore.yml
         name: thanos-gcp-config

  3. Install kube-prometheus-stack helm chart.
  4. Then test that you can at least list objects in the bucket, e.g.:

       thanos tools bucket ls --objstore.config="${OBJSTORE_CONFIG}"

  5. The sidecar uploads TSDB blocks to an object storage bucket as Prometheus produces them every 2 hours.

Install thanos:

  1. Update values.yaml with the secret name created in step 1 of prometheus installation.

       existingObjstoreSecret: "thanos-gcp-config"

There you go.

stale[bot] commented 1 year ago

Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use remind command if you wish to be reminded at some point in future.

muyukha commented 1 year ago

The key for your secret is thanos.yaml, so why use objstore.yml in the Thanos objectStorageConfig?
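
The question comes down to whether the key that `--from-file=KEY=PATH` creates in the secret matches the `key` under objectStorageConfig. Here is a small Python sketch of that comparison, using the values from the commands earlier in the thread. The helper names are hypothetical, and the sketch only compares strings; it does not talk to a cluster.

```python
def secret_key_from_arg(from_file_arg: str) -> str:
    """Return the secret key that a kubectl --from-file argument would create."""
    prefix = "--from-file="
    spec = from_file_arg[len(prefix):] if from_file_arg.startswith(prefix) else from_file_arg
    key, sep, _path = spec.partition("=")
    if sep:
        # Explicit KEY=PATH form: the part before "=" is the key.
        return key
    # PATH-only form: kubectl uses the file's base name as the key.
    return spec.rsplit("/", 1)[-1]


def keys_match(from_file_arg: str, configured_key: str) -> bool:
    """True when the secret key equals the key referenced by objectStorageConfig."""
    return secret_key_from_arg(from_file_arg) == configured_key


if __name__ == "__main__":
    arg = "--from-file=thanos.yaml=thanos-sidecar-secret.yaml"
    print(secret_key_from_arg(arg))        # key stored in the secret
    print(keys_match(arg, "objstore.yml"))  # key referenced by the chart values
```

With the values quoted in the steps above, the secret key is thanos.yaml while the chart references objstore.yml, so the two do not match as written.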

raghu-manne commented 3 months ago

@pranay142 What did you pass as part of objstore information since you are using workload identity?

Install prometheus: 
Create a secret with objstore information.
kubectl create secret generic thanos-gcp-config --from-file=thanos.yaml=thanos-sidecar-secret.yaml -n monitoring
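
The thread does not show the contents of thanos-sidecar-secret.yaml. As a rough sketch only: one plausible shape for a workload identity setup is a GCS config with the bucket name and an empty service_account, on the assumption (not confirmed here) that Thanos then falls back to the credentials available in the pod. The Python helper below, render_gcs_objstore, is hypothetical and simply renders such a file as text.

```python
import json


def render_gcs_objstore(bucket: str, service_account_json: str = "") -> str:
    """Render a GCS objstore config as YAML text.

    Hypothetical helper. json.dumps is used only to produce a
    double-quoted scalar, which is also valid YAML.
    """
    lines = [
        "type: GCS",
        "config:",
        f"  bucket: {json.dumps(bucket)}",
        # Assumption: empty means no inline key, so ambient
        # (workload identity) credentials would be used instead.
        f"  service_account: {json.dumps(service_account_json)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gcs_objstore("bucket-test-fuse"), end="")
```

The bucket name is the one from the original report; whether an empty service_account is what the author actually used is not stated in the thread.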