kedare closed this issue 6 years ago
Manually deleting the blocks from the bucket fixed the issue. The bucket verify could detect them but not repair them. Maybe there could be an option to back up and delete the original duplicated blocks, to unblock the situation without having to do it manually?
Thanks for reporting! This is a very good question. We wanted the repair job to be a "whitebox" repair, so it does not repair anything unless it knows exactly what issue it relates to. This avoids removing a block without understanding what exactly was wrong, which is important: otherwise, how do you fix the SOURCE of the issue?
In your case I would love to know what happened. Did you do anything manually with the blocks, perform ANY manual operation on object storage, or run bucket verify while the compactor was running in the system? Did you accidentally run 2 compactors, or is your configuration unique in any way? (: Or maybe the compactor crash-looped at some point in the past? Do you have logs of it?
I think the repair you propose makes sense, in the end, to actually unblock users and investigate later (: Maybe even automated mitigation in the compactor would be necessary. One idea would be to improve TSDB compaction as proposed in https://github.com/prometheus/tsdb/issues/90
But the investigation part is really necessary!
What happened exactly is a good question. I never made any manual change or ran a previous repair on the blocks directly; everything is managed by Thanos, and it's a fairly standard setup.
We have many projects/DCs, each managed by an independent prometheus/thanos-sidecar pair, and each with its own targets and replica label, so I can't explain how there could have been overlapping data.
The compactor runs as described in our Salt configuration: a single instance, single container that we run every hour with docker restart thanos-compactor. Maybe we could change this to docker start thanos-compactor to avoid interrupting a potential compaction in progress (but AFAIK it never runs for more than 1h).
If this could help you, I could send you the duplicated blocks in private if you want to take a look for troubleshooting; I backed them up (only the duplicate ones).
Sorry for the delayed answer. Can you still reproduce it? It might be fixed in v0.1.0. There is one important comparison you could make: check whether those blocks are from the same scraper or not. Let's reopen if you can repro.
Same here @kedare with Thanos version 0.4.0, with my pod in Kubernetes crashing every time.
level=error ts=2019-05-28T18:09:14.39872194Z caller=main.go:182 msg="running command failed" err="error executing compaction: compaction failed: compaction failed for group 0@{cluster=\"sue-gke-cluster01-euw1-rec\",environment=\"recette\",prometheus=\"prometheus-operator/prometheus-operator-prometheus\",prometheus_replica=\"prometheus-prometheus-operator-prometheus-0\"}: pre compaction overlap check: overlaps found while gathering blocks. [mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s, blocks: 2]: <ulid: 01D9CDYWRH1854GZKNWN4X8K6P, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>, <ulid: 01D9CDYWX2JEQMPKVYSC86XK90, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>"
After deleting these 2 blocks (01D9CDYWX2JEQMPKVYSC86XK90 and 01D9CDYWRH1854GZKNWN4X8K6P), everything was okay.
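For anyone hitting the same error, the offending ULIDs can be extracted from the log line mechanically rather than copied by hand. A small sketch (the regex and the abbreviated log line are illustrative, not official Thanos tooling):

```python
import re

# Abbreviated compactor "overlaps found" error line from this thread.
log_line = (
    'err="error executing compaction: ... overlaps found while gathering blocks. '
    '[mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s, blocks: 2]: '
    '<ulid: 01D9CDYWRH1854GZKNWN4X8K6P, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>, '
    '<ulid: 01D9CDYWX2JEQMPKVYSC86XK90, mint: 1556258400000, maxt: 1556265600000, range: 2h0m0s>"'
)

# ULIDs are 26 characters from Crockford's base32 alphabet (no I, L, O, U).
ulids = re.findall(r"<ulid: ([0-9A-HJKMNP-TV-Z]{26})", log_line)
print(ulids)
```

This yields the two overlapping block IDs to inspect or back up before deciding what to delete.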
Had the same problem, same messages as @Lord-Y. Not sure what caused it for me, but I fixed it by manually deleting one of the folders for each time range from the bucket and restarting the thanos-compactor pod.
Update: this was just a temporary fix. I think there's an underlying problem. In my scenario I have two instances of Prometheus as statefulsets.
@amartorelli it's definitely something that needs to be fixed. I also have statefulsets for my Prometheus instances. This issue happened after migrating from version 0.3.2 to 0.4.0.
I've noticed that the overlaps are checked with tsdb.OverlappingBlocks(metas). According to the function definition in the tsdb package, it only checks the timestamps:
https://github.com/prometheus/tsdb/blob/8eeb70fee1fcee33c8821502c09ddb5ed3e450c0/db.go#L773
Is it safe to say that, even if the timestamps overlap, when the meta files inside the folders contain unique labels the data has been pushed by two different instances of Prometheus (different external_labels sets), and hence they should pass the OverlappingBlocks check?
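For intuition, here's a toy illustration (plain Python, not the actual Thanos/tsdb code) of the difference between a time-only overlap check and one that first groups blocks by their external labels, which is, as far as I understand, what the compactor effectively does by building per-stream compaction groups:

```python
from collections import defaultdict

# Toy block metadata: (ulid, mint, maxt, external_labels).
metas = [
    ("block-1", 1556258400000, 1556265600000, "prometheus_replica=prometheus-0"),
    ("block-2", 1556258400000, 1556265600000, "prometheus_replica=prometheus-0"),
    ("block-3", 1556258400000, 1556265600000, "prometheus_replica=prometheus-1"),
]

def overlapping(blocks):
    """Time-only check in the spirit of tsdb.OverlappingBlocks: sort by mint
    and flag any block that starts before the previous one ends."""
    blocks = sorted(blocks, key=lambda b: b[1])
    return [
        (prev[0], cur[0])
        for prev, cur in zip(blocks, blocks[1:])
        if cur[1] < prev[2]  # cur.mint < prev.maxt
    ]

# Group by external labels first, so identical time ranges coming from
# *different* label sets (different Prometheus instances) never collide.
groups = defaultdict(list)
for meta in metas:
    groups[meta[3]].append(meta)

for labels, blocks in groups.items():
    print(labels, overlapping(blocks))
```

Only the two blocks sharing the same label set are flagged; the third block with the same time range but different labels passes cleanly.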
I also have the same issue as @Lord-Y & @amartorelli. I'm running two Prometheus replicas from a statefulset (deployed from the Prometheus Operator Helm chart) with Thanos sidecar v0.5.0, and from time to time I end up with some overlapping blocks and a Thanos Compactor in CrashLoopBackOff.
As of now I don't have a lot of clues, but I guess this has to be linked to the fact that I'm currently performing tests/updates on this Prometheus (notably adding/refactoring scrape configs), which leads to lots of Prometheus pod restarts.
Edit: I discovered an issue with the persistent storage of my Prometheus deployment. Now that it is fixed, I no longer get any Thanos Compactor errors or duplicated blocks.
Hi. Thanos Compactor is complaining about the same error, and it references two blobs that, according to bucket inspect, belong to the same Prometheus replica.
| ULID | FROM | UNTIL | RANGE | UNTIL-DOWN | #SERIES | #SAMPLES | #CHUNKS | COMP-LEVEL | COMP-FAILED | LABELS | RESOLUTION | SOURCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01DDAT9V8SHHTAYJXNJPTKVSP6 | 14-06-2019 08:00:00 | 14-06-2019 10:00:00 | 2h0m0s | 38h0m0s | 345,454 | 81,584,495 | 684,780 | 1 | false | cluster=ci,env=ci,prometheus=monitoring/k8s,prometheus_replica=prometheus-k8s-0 | 0s | sidecar |
| 01DDAT9VYEZ4QTVRJC4NJBT27F | 14-06-2019 08:00:00 | 14-06-2019 10:00:00 | 2h0m0s | 38h0m0s | 58,675 | 13,924,099 | 116,993 | 1 | false | cluster=ci,env=ci,prometheus=monitoring/k8s,prometheus_replica=prometheus-k8s-0 | 0s | sidecar |
As you can see, there are two blobs for the same date and time and the same Prometheus replica, with the same compaction level but different numbers of series, samples, and chunks. We are using Thanos 0.3.2 and Prometheus 2.5.0 (from prometheus-operator 0.30).
We have deleted all the blobs in the storage account (Azure) and we are still getting the overlapping error. Has this been solved in newer versions?
Thank you
@bwplotka I just faced the same issue. Nothing special about my setup (2 Prometheus instances scraping the same targets independently).
Thanos version: 0.6.0
I tried running thanos bucket verify --repair but I got: msg="repair is not implemented for this issue" issue=overlapped_blocks
Any plans to implement a repair for overlapped_blocks?
same here for thanos version 0.6.0
Experiencing this issue still in 0.6.x series.
Getting the same issue on the versions v0.8.1 and v0.9.0
We are having the same issue, running with 3 Prometheus statefulset pods.
Let me revisit this ticket again.
All known causes of overlaps are misconfiguration. We tried our best to explain all potential problems and solutions here: https://thanos.io/operating/troubleshooting.md/ . No automatic repair is possible in this case. (:
Super happy we finally have a nice doc about that, thanks to @daixiang0. Let's iterate on it if there is something missing. :hugs:
@bwplotka I'd like to revisit this issue, if you have a moment. We have a very simple stack set up using this docker-compose configuration. It has:
After running for just a couple of days, we're running into the "error executing compaction: compaction failed: compaction failed for group ...: pre compaction overlap check: overlaps found while gathering blocks." error.
The troubleshooting document suggests the following reasons for this error:
Misconfiguration of sidecar/ruler: Same external labels or no external labels across many block producers.
We only have a single sidecar (and no ruler).
Running multiple compactors for single block “stream”, even for short duration.
We only have a single compactor.
Manually uploading blocks to the bucket.
This never happened.
Eventually consistent block storage until we fully implement RW for bucket
I'm not entirely sure what this is suggesting. Given only a single producer of information and a single bucket for storage, I'm not sure how eventual consistency could be a problem.
If you have a minute (or @daixiang0 or someone else) I would appreciate some insight into what could be causing this problem.
We're running:
$ docker-compose exec thanos_store thanos --version
thanos, version 0.12.0-dev (branch: master, revision: d18e1aec64ca3de143930a87d60bc52fe733e682)
build user: circleci@db15d248cc4d
build date: 20200304-16:25:49
go version: go1.13.1
With:
$ docker-compose exec prom_main prometheus --version
prometheus, version 2.16.0 (branch: HEAD, revision: b90be6f32a33c03163d700e1452b54454ddce0ec)
build user: root@7ea0ae865f12
build date: 20200213-23:50:02
go version: go1.13.8
Thanks for the very clean write-up, @larsks!
Eventually consistent block storage until we fully implement RW for bucket
I'm not entirely sure what this is suggesting. Given only a single producer of information and a single bucket for storage, I'm not sure how eventual consistency could be a problem.
Well, I actually think that is the issue. I remember someone else reporting that Swift is not strongly consistent. Eventual consistency creates thousands of issues. You have a single producer, yes, but imagine the Compactor is creating a block. Then it removes the old blocks, because it just created the compacted block from them, right? So all good! It deletes the old blocks and starts a new iteration. Now we can have so many different cases:
Overall we spent a lot of time on a design solution that will work for the rather rare case of eventually consistent storages... so trust us. Together with @squat and @khyatisoneji we can elaborate on what more can go wrong in such cases... And in the end you can read more details on what was done and what is still planned here: https://thanos.io/proposals/201901-read-write-operations-bucket.md/
Overall the new Thanos version will help you a lot, but there is still an issue with the compactor replicating blocks by accident on eventually consistent storages. We are missing this item: https://github.com/thanos-io/thanos/issues/2283
In the meantime I think you can try enabling vertical compaction. This will ensure that the compactor handles overlaps... by simply compacting them again into one block. This is experimental though. cc @kakkoyun @metalmatze
Ideally, I would suggest using anything other than Swift, as other object storages have no issues like this.
Thanks for the response! I have a few questions for you:
Overall the new Thanos version will help you a lot,
Do you mean "a future version of Thanos" or do you mean we should simply upgrade to master?
In the meantime I think you can try enabling vertical compaction.
Is that as simple as adding --deduplication.replica-label to the thanos compactor invocation? "Vertical compaction" isn't mentioned explicitly anywhere other than in CHANGELOG.md right now.
Ideally, I would suggest using anything else than Swift, as other object storages have no issues like this.
Is it okay to use filesystem-backed storage? It has all sorts of warnings in https://github.com/thanos-io/thanos/blob/master/docs/storage.md, but we're not interested in a paid option like GCS or S3, and we don't have a local S3-analog other than Swift. I guess we could set up something like MinIO, but requiring a separate storage service just for Thanos isn't a great option.
Do you mean "a future version of Thanos" or do you mean we should simply upgrade to master?
I mean v0.12.0-rc.1. The one with @khyatisoneji's delayed-delete feature.
Is that as simple as adding --deduplication.replica-label to the thanos compactor invocation? "Vertical compaction" isn't mentioned explicitly anywhere other than in CHANGELOG.md right now.
Yes! In fact, you don't need a replica label; you can even put not-existed-I-don't-have-any. This will make the Compactor combine overlapping blocks together instead of halting... but this is more of a bug (cc @kakkoyun) than a feature, and we plan to start halting on unexpected overlaps as well.
:warning: :warning: This is not documented for a reason. We just added experimental support for it, and it does NOT work for Prometheus HA replicas yet. Also, if you experience block overlap and you just enable this: be aware that this feature was made for offline deduplication, not for overlaps, so we never thought through or tested all the bad cases that can happen because of it (: I just "think" it might work as a side effect. Help wanted to experiment more with this. Never on a production setup though :warning: :warning:
Is it okay to use filesystem-backed storage? It has all sorts of warnings in https://github.com/thanos-io/thanos/blob/master/docs/storage.md, but we're not interested in a paid option like GCS or S3, and we don't have a local S3-analog other than Swift. I guess we could set up something like MinIO, but requiring a separate storage service just for Thanos isn't a great option.
Really depends on your needs and the amount of data. It has warnings to avoid cases like users being too smart and running on NFS (: etc. If your data fits on disk and you are fine with manual backups and disk-resize operations, then filesystem should do just fine. Please feel free to test it out; it is definitely production-grade, tested, and maintained. We can rephrase the docs to state so.
I switched over to using filesystem-backed storage on 4/10, and it has survived the past several days without error. Looks like it was a storage issue, so hopefully things will be stable for a bit now.
Yes, let us know how it goes. I am pretty sure it should be stable, so we can remove the experimental mention in the docs (:
Bartek
Hey everyone 👋🏼 I would like to explore the possibility of adding some sort of fixing command to the thanos tools bucket group of subcommands to facilitate handling this kind of issue.
This is a very good question. We wanted the repair job to be a "whitebox" repair, so it does not repair anything unless it knows exactly what issue it relates to. This avoids removing a block without understanding what exactly was wrong, which is important: otherwise, how do you fix the SOURCE of the issue?
and
All known causes of overlaps are misconfiguration. We tried our best to explain all potential problems and solutions here: https://thanos.io/operating/troubleshooting.md/ . No automatic repair is possible in this case. (:
@bwplotka Thanks for all the details shared on this issue! I understand why the team decided against fixing the problem without knowing its origin. In the end, though, users still need to act somehow to clean up the bucket so the Thanos compactor can get back to work on newly added data and everything else that isn't overlapping, even if the root cause was a misconfiguration. Depending on the time window and the amount of data stored in the bucket, the effort required to clean it up can get pretty big, forcing users to write their own scripts and risking ending up in an even worse situation.
I would like to propose adding a complementary command, or evolving the current one (thanos tools bucket verify --repair), with the ability to move the affected blocks to a backup bucket, or something similar, so that users could get the compactor running again and then decide what to do with the affected data. We could also consider implementing a way to move the data back once it's sorted out 🤔
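A rough sketch of what such a "quarantine" flow could look like. This is a hypothetical helper operating on a filesystem-backed bucket for illustration; for S3/GCS/Azure the same flow would go through the object-store API instead:

```python
import shutil
import tempfile
from pathlib import Path

def quarantine_blocks(bucket_dir, backup_dir, ulids):
    """Move the affected block directories out of the live bucket so the
    compactor can resume; they can be moved back later if the data turns
    out to be salvageable. Returns the ULIDs that were actually moved."""
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for ulid in ulids:
        src = Path(bucket_dir) / ulid
        if src.is_dir():
            shutil.move(str(src), str(backup / ulid))
            moved.append(ulid)
    return moved

# Demo on a throwaway filesystem "bucket" with two fake overlapping blocks.
tmp = Path(tempfile.mkdtemp())
bucket, backup = tmp / "bucket", tmp / "backup"
for ulid in ("01D9CDYWRH1854GZKNWN4X8K6P", "01D9CDYWX2JEQMPKVYSC86XK90"):
    (bucket / ulid).mkdir(parents=True)
    (bucket / ulid / "meta.json").write_text("{}")

moved = quarantine_blocks(bucket, backup, ["01D9CDYWX2JEQMPKVYSC86XK90"])
print(moved)
```

After the move, one block of the overlapping pair remains in the bucket, the other sits in the backup location, and nothing has been irreversibly deleted.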
If that's something that makes sense for the project, I would love to explore the topic and contribute. I would appreciate some feedback and guidance on this (should I open a new issue?)
Thanks
@B0go that would be awesome. We experience this issue every few weeks on one of our clusters. Seems to happen randomly. We already got a dedicated alert in monitoring with runbook for what to delete. I guess we could just run the repair command as cronjob proactively in the future to prevent that kind of manual toil.
Thanos, Prometheus and Golang version used: Docker image version master-2018-08-04-8b7169b (an older version was also affected)
What happened: The Thanos compactor is failing to run (crashes)
What you expected to happen: The Thanos compactor to compact :)
How to reproduce it (as minimally and precisely as possible): Good question; it was running fine before, and when checking it on the server I found it restarting in a loop (because of the Docker restart policy, which is to restart on crash)
Full logs to relevant components
Related configuration from our DSC (Salt)
Let me know if you need any more information.