Closed asmith60 closed 4 years ago
Hi, any update on this issue?
I am seeing the same issue: the Thanos store gateway is stuck at "initializing bucket store" when the container starts. No other warning or error appears in the log. Any idea why this happens, or how to find the root cause?
The logs are given below:
level=info ts=2019-09-05T14:37:53.221491945Z caller=flags.go:75 msg="gossip is disabled"
level=info ts=2019-09-05T14:37:53.222294564Z caller=factory.go:39 msg="loading bucket configuration"
level=debug ts=2019-09-05T14:37:53.223374047Z caller=store.go:128 msg="initializing bucket store"
Thanks,
Sorry for the delay!
On startup, the Store Gateway loads a portion of the objects into memory, so if you don't have a compactor running (do you have one? Is it working?), startup will be a long and memory-intensive process.
Most likely the store is just OOMing in your case. Give it more memory, time-shard the store gateway (see: https://github.com/thanos-io/thanos/pull/1077), or add a compactor if it is missing (!).
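As a rough illustration of the time-sharding suggestion, the bucket could be split between two store instances by time range. This is only a sketch: it assumes a Thanos release that includes the linked PR (the v0.6.0 image used in this issue may not), the `--min-time` / `--max-time` flags as they appear in later Thanos store documentation, and an arbitrary 8-week split point that would need tuning for a real bucket.

```
# Hypothetical container args for two separate store StatefulSets.
# Store A: serves only blocks newer than 8 weeks.
args:
  - store
  - --data-dir=/data
  - --min-time=-8w
---
# Store B: serves only blocks older than 8 weeks.
args:
  - store
  - --data-dir=/data
  - --max-time=-8w
```

Each instance then has to load metadata for only part of the bucket at startup, which should reduce both startup time and memory per pod.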
We are also planning to rework this area so that store startup is seamless :tada: More info: https://github.com/thanos-io/thanos/issues/1471
Hope that helps (:
@anoop2503 I just needed to give the store more time to start up (about 5 minutes in my case). It seems that the more memory I give the store, the less time it takes to start.
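On Kubernetes, one way to give the store that extra time is to relax the probes. The fragment below is a minimal sketch, assuming the pod is being restarted by a liveness probe with default timings before the initial sync finishes (this thread does not confirm that cause). The 300-second delay mirrors the roughly 5 minutes mentioned above; all values are illustrative and untested against any particular bucket.

```
# Hypothetical probe settings for the thanos-store container.
livenessProbe:
  httpGet:
    path: /metrics
    port: http
  initialDelaySeconds: 300  # wait about 5 minutes before the first liveness check
  periodSeconds: 30
  failureThreshold: 10      # tolerate a slow start before restarting the container
readinessProbe:
  httpGet:
    path: /metrics
    port: http
  initialDelaySeconds: 30
  periodSeconds: 30
```

The readiness probe can stay comparatively aggressive, since failing it only removes the pod from service endpoints and does not restart the container.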
Also, we could and should probably be more verbose here at the debug level (or info) so that users would know what blocks we are pulling just like Prometheus, for example, prints what blocks it finds on the disk.
Thanos, Prometheus and Golang version used
Thanos: 0.6.0
Prometheus: 2.10.0

What happened
The Thanos store won't start. It tries to start up, but crashes in ~30 seconds. Inspecting the pod indicates that the process exited with a non-zero code. The log output with debug enabled is below.

What you expected to happen
Thanos store to start successfully.

Anything else we need to know
6 HA pairs of Prometheus instances (12 total instances) are uploading metrics to the AWS S3 bucket. The current bucket size is ~750GB. The store pod manifest is below (I removed the obj-store config, AWS IAM config, etc.).
```
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: thanos-store
  namespace: monitoring
  labels:
    app: thanos-store
spec:
  replicas: 3
  selector:
    matchLabels:
      app: thanos-store
  serviceName: thanos-store
  template:
    metadata:
      labels:
        app: thanos-store
    spec:
      containers:
        - name: thanos-store
          imagePullPolicy: Always
          image: "improbable/thanos:v0.6.0"
          args:
            - store
            - --data-dir=/data
            - --log.level=debug
            - --index-cache-size=8GB
            - --chunk-pool-size=20GB
          ports:
            - name: http
              containerPort: 10902
              protocol: TCP
            - name: grpc
              containerPort: 10901
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /metrics
              port: http
          readinessProbe:
            httpGet:
              path: /metrics
              port: http
          resources:
            limits:
              cpu: 2000m
              memory: 32000Mi
            requests:
              cpu: 2000m
              memory: 32000Mi
          volumeMounts:
            - mountPath: /data
              name: storage-volume
  volumeClaimTemplates:
    - metadata:
        name: storage-volume
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "128Gi"
```