kedare closed this issue 6 years ago
Manually deleting the blocks from the bucket fixed the issue. The bucket verify could detect them but not repair them. Maybe there could be an option to back up and delete the original duplicated blocks, to unblock the situation without having to do it manually?
Thanks for reporting! This is a very good question. We wanted the repair job to be a "whitebox" repair, so it does not repair anything unless it knows exactly what issue it relates to. This avoids removing a block without understanding what exactly was wrong, which is important: otherwise, how do you fix the SOURCE of the issue?
In your case I would love to know what happened. Did you do anything manually with the blocks, perform ANY manual operation on object storage, or run bucket verify while the compactor was running in the system? Did you accidentally run 2 compactors, or is your configuration unique in any way? (: Or maybe the compactor crash-looped at some point in the past? Do you have logs of it?
I think the repair you propose makes sense, in the end, to actually unblock users and investigate later (: Maybe even automated mitigation in the compactor would be necessary. One idea would be to improve TSDB compaction as proposed in https://github.com/prometheus/tsdb/issues/90
But the investigation part is really necessary!
What happened exactly is a good question. I never made any manual change or ran a previous repair on the blocks directly; everything is managed by Thanos, and it's a fairly standard setup.
We have many projects/DCs, each managed by an independent prometheus/thanos-sidecar pair, and each with its own targets and replica label, so I can't explain how there could have been overlapping data.
The compactor runs as described in our Salt configuration: a single instance, single container that we run every hour with docker restart thanos-compactor. Maybe we could change this to docker start thanos-compactor to avoid interrupting a potential compaction in progress (but AFAIK it never runs for more than 1h).
If this could help you, I could send you the duplicated blocks in private if you want to take a look for troubleshooting; I backed them up (only the duplicate ones).
Sorry for the delayed answer. Can you still reproduce it? It might be fixed in v0.1.0. There is one important comparison you could make: check whether those blocks are from the same scraper or not. Let's reopen if you can repro.
Same here @kedare with Thanos version 0.4.0, with my pod in Kubernetes crashing every time.
level=error ts=2019-05-28T18:09:14.39872194Z caller=main.go:182 msg="running command failed" err="error executing compaction: compaction failed: compaction failed for group 0@{cluster=\"sue-gke-cluster01-euw1-rec\",environment=\"recette\",prometheus=\"prometheus-operator/prometheus-operator-prometheus\",prometheus_replica=\"prometheus-prometheus-operator-prometheus-0\"}: pre compaction overlap check: overlaps found while gathering blocks. [mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s, blocks: 2]: <ulid: 01D9CDYWRH1854GZKNWN4X8K6P, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>, <ulid: 01D9CDYWX2JEQMPKVYSC86XK90, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>"
After deleting these 2 blocks (01D9CDYWX2JEQMPKVYSC86XK90 and 01D9CDYWRH1854GZKNWN4X8K6P), everything was okay.
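For anyone hitting the same error, the offending ULIDs can be extracted from the log line mechanically rather than copied by hand. A small sketch (the regex and the abbreviated log line are illustrative, not official Thanos tooling):

```python
import re

# Abbreviated compactor "overlaps found" error line from this thread.
log_line = (
    'err="error executing compaction: ... overlaps found while gathering blocks. '
    '[mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s, blocks: 2]: '
    '<ulid: 01D9CDYWRH1854GZKNWN4X8K6P, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>, '
    '<ulid: 01D9CDYWX2JEQMPKVYSC86XK90, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>"'
)

# ULIDs are 26 characters from Crockford's base32 alphabet (no I, L, O, U).
ulids = re.findall(r"<ulid: ([0-9A-HJKMNP-TV-Z]{26})", log_line)
print(ulids)
```

This yields the two overlapping block IDs to inspect or back up before deciding what to delete.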
Had the same problem, same messages as @Lord-Y. Not sure what caused it for me, but I fixed it by manually deleting one of the folders for each time range from the bucket and restarting the thanos-compactor pod.
Update: this was just a temporary fix. I think there's an underlying problem. In my scenario I have two instances of Prometheus as statefulsets.
@amartorelli it's definitely something that needs to be fixed. I also have statefulsets for my Prometheus instances. This issue happened after migrating from version 0.3.2 to 0.4.0.
I've noticed that the overlaps are checked with tsdb.OverlappingBlocks(metas). According to the function definition in the tsdb package, it only checks the timestamps:
https://github.com/prometheus/tsdb/blob/8eeb70fee1fcee33c8821502c09ddb5ed3e450c0/db.go#L773
Is it safe to say that, even if the timestamps overlap, when the meta files inside the folders contain unique labels the data has been pushed by two different instances of Prometheus (different external_labels sets), and hence they should pass the OverlappingBlocks check?
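For intuition, here's a toy illustration (plain Python, not the actual Thanos/tsdb code) of the difference between a time-only overlap check and one that first groups blocks by their external labels, which is, as far as I understand, what the compactor effectively does by building per-stream compaction groups:

```python
from collections import defaultdict

# Toy block metadata: (ulid, mint, maxt, external_labels).
metas = [
    ("block-1", 1556258400000, 1556265600000, "prometheus_replica=prometheus-0"),
    ("block-2", 1556258400000, 1556265600000, "prometheus_replica=prometheus-0"),
    ("block-3", 1556258400000, 1556265600000, "prometheus_replica=prometheus-1"),
]

def overlapping(blocks):
    """Time-only check in the spirit of tsdb.OverlappingBlocks: sort by mint
    and flag any block that starts before the previous one ends."""
    blocks = sorted(blocks, key=lambda b: b[1])
    return [
        (prev[0], cur[0])
        for prev, cur in zip(blocks, blocks[1:])
        if cur[1] < prev[2]  # cur.mint < prev.maxt
    ]

# Group by external labels first, so identical time ranges coming from
# *different* label sets (different Prometheus instances) never collide.
groups = defaultdict(list)
for meta in metas:
    groups[meta[3]].append(meta)

for labels, blocks in groups.items():
    print(labels, overlapping(blocks))
```

Only the two blocks sharing the same label set are flagged; the third block with the same time range but different labels passes cleanly.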
I also have the same issue as @Lord-Y & @amartorelli. I'm running two Prometheus replicas from a statefulset (deployed from the Prometheus Operator Helm chart) with Thanos sidecar v0.5.0, and from time to time I end up with some overlapping blocks and a Thanos Compactor in CrashLoopBackOff.
As of now I don't have a lot of clues, but I guess this has to be linked to the fact that I'm currently performing tests/updates on this Prometheus (notably adding/refactoring scrape configs), which leads to lots of Prometheus pod restarts.
Edit: I discovered an issue with the persistent storage of my Prometheus deployment. Now that it is fixed, I no longer get any Thanos Compactor errors or duplicated blocks.
Hi. Thanos Compactor is complaining about the same error, and it references two blobs that, according to bucket inspect, belong to the same Prometheus replica.
| ULID | FROM | UNTIL | RANGE | UNTIL-DOWN | #SERIES | #SAMPLES | #CHUNKS | COMP-LEVEL | COMP-FAILED | LABELS | RESOLUTION | SOURCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01DDAT9V8SHHTAYJXNJPTKVSP6 | 14-06-2019 08:00:00 | 14-06-2019 10:00:00 | 2h0m0s | 38h0m0s | 345,454 | 81,584,495 | 684,780 | 1 | false | cluster=ci,env=ci,prometheus=monitoring/k8s,prometheus_replica=prometheus-k8s-0 | 0s | sidecar |
| 01DDAT9VYEZ4QTVRJC4NJBT27F | 14-06-2019 08:00:00 | 14-06-2019 10:00:00 | 2h0m0s | 38h0m0s | 58,675 | 13,924,099 | 116,993 | 1 | false | cluster=ci,env=ci,prometheus=monitoring/k8s,prometheus_replica=prometheus-k8s-0 | 0s | sidecar |
As you can see, there are two blobs for the same date and time and the same Prometheus replica, with the same compaction level but different numbers of series, samples, and chunks. We are using Thanos 0.3.2 and Prometheus 2.5.0 (from prometheus-operator 0.30).
We have deleted all the blobs in the storage account (Azure) and we are still getting the overlapping error. Has this been solved in newer versions?
Thank you
@bwplotka I just faced the same issue. Nothing special about my setup (2 Prometheus instances scraping the same targets independently).
Thanos version: 0.6.0
I tried running thanos bucket verify --repair but I got: msg="repair is not implemented for this issue" issue=overlapped_blocks
Any plans to implement a repair for overlapped_blocks?
same here for thanos version 0.6.0
Experiencing this issue still in 0.6.x series.
Getting the same issue on the versions v0.8.1 and v0.9.0
We are having the same issue, running with 3 Prometheus statefulset pods.
Let me revisit this ticket again.
All known causes of overlaps are misconfiguration. We tried our best to explain all potential problems and solutions here: https://thanos.io/operating/troubleshooting.md/ . No automatic repair is possible in this case. (:
Super happy we finally have a nice doc about that, thanks to @daixiang0. Let's iterate on it if there is something missing. :hugs:
@bwplotka I'd like to revisit this issue, if you have a moment. We have a very simple stack set up using this docker-compose configuration. It has:
After running for just a couple of days, we're running into the "error executing compaction: compaction failed: compaction failed for group ...: pre compaction overlap check: overlaps found while gathering blocks." error.
The troubleshooting document suggests the following reasons for this error:
Misconfiguration of sidecar/ruler: Same external labels or no external labels across many block producers.
We only have a single sidecar (and no ruler).
Running multiple compactors for single block “stream”, even for short duration.
We only have a single compactor.
Manually uploading blocks to the bucket.
This never happened.
Eventually consistent block storage until we fully implement RW for bucket
I'm not entirely sure what this is suggesting. Given only a single producer of information and a single bucket for storage, I'm not sure how eventual consistency could be a problem.
If you have a minute (or @daixiang0 or someone else) I would appreciate some insight into what could be causing this problem.
We're running:
$ docker-compose exec thanos_store thanos --version
thanos, version 0.12.0-dev (branch: master, revision: d18e1aec64ca3de143930a87d60bc52fe733e682)
build user: circleci@db15d248cc4d
build date: 20200304-16:25:49
go version: go1.13.1
With:
$ docker-compose exec prom_main prometheus --version
prometheus, version 2.16.0 (branch: HEAD, revision: b90be6f32a33c03163d700e1452b54454ddce0ec)
build user: root@7ea0ae865f12
build date: 20200213-23:50:02
go version: go1.13.8
Thanks for the very clean write-up, @larsks!
Eventually consistent block storage until we fully implement RW for bucket
I'm not entirely sure what this is suggesting. Given only a single producer of information and a single bucket for storage, I'm not sure how eventual consistency could be a problem.
Well, I actually think that is the issue. I remember someone else reporting that Swift is not strongly consistent. Eventual consistency creates thousands of issues. You have a single producer, yes, but imagine the Compactor is creating a block. Then it removes the old blocks, because it just created the compacted block from them, right? So all good! It deletes the old blocks and starts a new iteration. Now we can have so many different cases:
Overall we spent a lot of time on a design solution that will work for the rather rare case of eventually consistent storages... so trust us. Together with @squat and @khyatisoneji we can elaborate on what more can go wrong in such cases... And in the end you can read more details on what was done and what is still planned here: https://thanos.io/proposals/201901-read-write-operations-bucket.md/
Overall the new Thanos version will help you a lot, but there is still an issue with the compactor replicating blocks by accident on eventually consistent storages. We are missing this item: https://github.com/thanos-io/thanos/issues/2283
In the meantime I think you can try enabling vertical compaction. This will ensure that the compactor handles overlaps... by simply compacting them again into one block. This is experimental though. cc @kakkoyun @metalmatze
Ideally, I would suggest using anything other than Swift, as other object storages have no issues like this.
Thanks for the response! I have a few questions for you:
Overall the new Thanos version will help you a lot,
Do you mean "a future version of Thanos" or do you mean we should simply upgrade to master?
In the meantime I think you can try enabling vertical compaction.
Is that as simple as adding --deduplication.replica-label to the thanos compactor invocation? "Vertical compaction" isn't mentioned explicitly anywhere other than in CHANGELOG.md right now.
Ideally, I would suggest using anything else than Swift, as other object storages have no issues like this.
Is it okay to use filesystem-backed storage? It has all sorts of warnings in https://github.com/thanos-io/thanos/blob/master/docs/storage.md, but we're not interested in a paid option like GCS or S3, and we don't have a local S3-analog other than Swift. I guess we could set up something like MinIO, but requiring a separate storage service just for Thanos isn't a great option.
Do you mean "a future version of Thanos" or do you mean we should simply upgrade to master?
I mean v0.12.0-rc.1. The one with @khyatisoneji's delayed-delete feature.
Is that as simple as adding --deduplication.replica-label to the thanos compactor invocation? "Vertical compaction" isn't mentioned explicitly anywhere other than in CHANGELOG.md right now.
Yes! In fact, you don't need a replica label; you can even put not-existed-I-don't-have-any. This will make the Compactor combine overlapping blocks together instead of halting... but this is more of a bug (cc @kakkoyun) than a feature, and we plan to start halting on unexpected overlaps as well.
:warning: :warning: This is not documented for a reason. We just added experimental support for it, and it does NOT work for Prometheus HA replicas yet. Also, if you experience block overlap and you just enable this: be aware that this feature was made for offline deduplication, not for overlaps, so we never thought through or tested all the bad cases that can happen because of it (: I just "think" it might work as a side effect. Help wanted to experiment more with this. Never on a production setup though :warning: :warning:
Is it okay to use filesystem-backed storage? It has all sorts of warnings in https://github.com/thanos-io/thanos/blob/master/docs/storage.md, but we're not interested in a paid option like GCS or S3, and we don't have a local S3-analog other than Swift. I guess we could set up something like MinIO, but requiring a separate storage service just for Thanos isn't a great option.
Really depends on your needs and the amount of data. It has warnings to avoid cases like users being too smart and running on NFS (: etc. If your data fits on disk and you are fine with manual backups and disk-resize operations, then filesystem should do just fine. Please feel free to test it out; it is definitely production-grade, tested, and maintained. We can rephrase the docs to state so.
I switched over to using filesystem-backed storage on 4/10, and it has survived the past several days without error. Looks like it was a storage issue, so hopefully things will be stable for a bit now.
Yes, let us know how it goes. I am pretty sure it should be stable, so we can remove the experimental mention in the docs (:
Bartek
Hey everyone 👋🏼 I would like to explore the possibility of adding some sort of fixing command to the thanos tools bucket group of subcommands to facilitate handling this kind of issue.
This is a very good question. We wanted the repair job to be a "whitebox" repair, so it does not repair anything unless it knows exactly what issue it relates to. This avoids removing a block without understanding what exactly was wrong, which is important: otherwise, how do you fix the SOURCE of the issue?
and
All known causes of overlaps are misconfiguration. We tried our best to explain all potential problems and solutions here: https://thanos.io/operating/troubleshooting.md/ . No automatic repair is possible in this case. (:
@bwplotka Thanks for all the details shared on this issue! I understand why the team decided against fixing the problem without knowing its origin. In the end, though, users still need to act somehow to clean up the bucket so the Thanos compactor can get back to work on newly added data and everything else that isn't overlapping, even if the root cause was a misconfiguration. Depending on the time window and the amount of data stored in the bucket, the effort required to clean it up can get pretty big, forcing users to write their own scripts and risking ending up in an even worse situation.
I would like to propose adding a complementary command, or evolving the current one (thanos tools bucket verify --repair), with the ability to move the affected blocks to a backup bucket, or something similar, so that users could get the compactor running again and then decide what to do with the affected data. We could also consider implementing a way to move the data back once it's sorted out 🤔
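A rough sketch of what such a "quarantine" flow could look like. This is a hypothetical helper operating on a filesystem-backed bucket for illustration; for S3/GCS/Azure the same flow would go through the object-store API instead:

```python
import shutil
import tempfile
from pathlib import Path

def quarantine_blocks(bucket_dir, backup_dir, ulids):
    """Move the affected block directories out of the live bucket so the
    compactor can resume; they can be moved back later if the data turns
    out to be salvageable. Returns the ULIDs that were actually moved."""
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for ulid in ulids:
        src = Path(bucket_dir) / ulid
        if src.is_dir():
            shutil.move(str(src), str(backup / ulid))
            moved.append(ulid)
    return moved

# Demo on a throwaway filesystem "bucket" with two fake overlapping blocks.
tmp = Path(tempfile.mkdtemp())
bucket, backup = tmp / "bucket", tmp / "backup"
for ulid in ("01D9CDYWRH1854GZKNWN4X8K6P", "01D9CDYWX2JEQMPKVYSC86XK90"):
    (bucket / ulid).mkdir(parents=True)
    (bucket / ulid / "meta.json").write_text("{}")

moved = quarantine_blocks(bucket, backup, ["01D9CDYWX2JEQMPKVYSC86XK90"])
print(moved)
```

After the move, one block of the overlapping pair remains in the bucket, the other sits in the backup location, and nothing has been irreversibly deleted.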
If that's something that makes sense for the project, I would love to explore the topic and contribute. I would appreciate some feedback and guidance on this (should I open a new issue?)
Thanks
@B0go that would be awesome. We experience this issue every few weeks on one of our clusters. Seems to happen randomly. We already got a dedicated alert in monitoring with runbook for what to delete. I guess we could just run the repair command as cronjob proactively in the future to prevent that kind of manual toil.
Thanos, Prometheus and Golang version used: Docker image version master-2018-08-04-8b7169b (an older version was also affected)
What happened: The Thanos compactor is failing to run (crashes)
What you expected to happen: The Thanos compactor to compact :)
How to reproduce it (as minimally and precisely as possible): Good question; it was running fine before, and when checking it on the server I found it restarting in a loop (because of the Docker restart policy, which is to restart on crash)
Full logs to relevant components
Related configuration from our DSC (Salt)
Let me know if you need any more information.