Hello,
Have you tried using:
--wait-interval=5m
Wait interval between consecutive compaction runs and bucket refreshes. Only works when --wait flag specified.
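For reference, a minimal sketch of a compactor invocation with both flags; the data directory and objstore config path below are placeholders, not taken from your deployment:

thanos compact \
  --data-dir=/var/thanos/compactor \
  --objstore.config-file=/etc/thanos/objstore.yaml \
  --wait \
  --wait-interval=5m
# --wait keeps the compactor running as a long-lived service;
# --wait-interval only takes effect when --wait is set.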
Thanks for the reply!
I tried adding the --wait-interval=5m parameter, but it did not help.
There are still a large number of repeated log lines like the following:
level=info ts=2022-05-05T15:57:30.310923964Z caller=blocks_cleaner.go:58 msg="cleaning of blocks marked for deletion done"
level=info ts=2022-05-05T15:57:42.20936612Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.163160885s duration_ms=10163 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T15:58:42.229451273Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.183314864s duration_ms=10183 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T15:59:42.219694111Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.173728222s duration_ms=10173 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:00:42.447690898Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.40206302s duration_ms=10402 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:01:42.691945272Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.646627619s duration_ms=10646 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:02:17.940555585Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.620869519s duration_ms=13620 cached=13960 returned=1227 partial=1
level=info ts=2022-05-05T16:02:17.95942226Z caller=clean.go:34 msg="started cleaning of aborted partial uploads"
level=info ts=2022-05-05T16:02:17.95945469Z caller=clean.go:61 msg="cleaning of aborted partial uploads done"
level=info ts=2022-05-05T16:02:17.959464355Z caller=blocks_cleaner.go:44 msg="started cleaning of blocks marked for deletion"
level=info ts=2022-05-05T16:02:17.959559292Z caller=blocks_cleaner.go:58 msg="cleaning of blocks marked for deletion done"
level=info ts=2022-05-05T16:02:31.59749299Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.637959951s duration_ms=13637 cached=13960 returned=1227 partial=1
level=info ts=2022-05-05T16:02:42.722209006Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.676682571s duration_ms=10676 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:03:42.712906688Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.666756324s duration_ms=10666 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:04:42.570681982Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.524455654s duration_ms=10524 cached=13960 returned=13960 partial=1
level=info ts=2022-05-05T16:05:42.846779864Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=10.801424978s duration_ms=10801 cached=13960 returned=13960 partial=2
level=info ts=2022-05-05T16:07:03.456053746Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=31.410013001s duration_ms=31410 cached=13960 returned=13960 partial=2
level=info ts=2022-05-05T16:07:17.633996367Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.31411497s duration_ms=13314 cached=13960 returned=1227 partial=2
level=info ts=2022-05-05T16:07:30.984159971Z caller=fetcher.go:470 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=13.330225285s duration_ms=13330 cached=13960 returned=1227 partial=2
@BegoniaGit Is the compactor currently in a halted state? You can check that in the logs or via the compactor metric thanos_compact_halted.
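For example, assuming the default HTTP port 10902 and an illustrative pod name, you could read the metric directly from the compactor's own metrics endpoint:

# forward the compactor's HTTP port locally (pod name is a placeholder)
kubectl port-forward pod/<compactor-pod> 10902:10902
# 0 = running normally, 1 = halted due to an unexpected error
curl -s http://localhost:10902/metrics | grep thanos_compact_halted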
@yeya24 Thank you very much! The compaction did indeed stop. There is no relevant error log now, which I find very confusing. Could you give me some suggestions for further investigation?
# HELP thanos_compact_halted Set to 1 if the compactor halted due to an unexpected error.
# TYPE thanos_compact_halted gauge
thanos_compact_halted 1
I think there must be some logs indicating the reason for halting. Maybe you can restart the compactor so that it hits the halting error again. Then you can take a look at why.
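Assuming the compactor runs as a StatefulSet (the name below is a placeholder, adjust to your deployment), a restart plus log follow could look like:

kubectl rollout restart statefulset/thanos-compactor
# watch for the "critical error detected; halting" line to reappear
kubectl logs -f statefulset/thanos-compactor | grep -i halt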
@yeya24 Thank you! After restarting, thanos_compact_halted went to 1 again and I got some details:
level=error ts=2022-05-15T11:41:18.413393736Z caller=compact.go:495 msg="critical error detected; halting" err="compaction: group 0@10665639288258739055: compact blocks [/var/thanos/compactor/thanos-compactor-01/compact/0@10665639288258739055/01G292HXWFQY79P88N50WATF0M /var/thanos/compactor/thanos-compactor-01/compact/0@10665639288258739055/01G299DN46V8XFSC10744SRXMX /var/thanos/compactor/thanos-compactor-01/compact/0@10665639288258739055/01G29G9CC6JCSJ64PKEK363P2S /var/thanos/compactor/thanos-compactor-01/compact/0@10665639288258739055/01G29Q53N8T3Q4SHEXNN4NTMN9]: 2 errors: populate block: write chunks: preallocate: no space left on device; sync /var/thanos/compactor/thanos-compactor-01/compact/0@10665639288258739055/01G33QXXJXNF9ZZD5WCNK0XYHV.tmp-for-creation/chunks/000002: file already closed"
For now I have cleared the disk and lowered the compaction concurrency to 1, and I will keep observing.
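Roughly what the adjusted flags would look like, assuming the data directory from the error log above and a placeholder objstore config path; lower concurrency means fewer groups are compacted at once, which reduces peak scratch-disk usage:

thanos compact \
  --data-dir=/var/thanos/compactor/thanos-compactor-01 \
  --objstore.config-file=/etc/thanos/objstore.yaml \
  --wait \
  --wait-interval=5m \
  --compact.concurrency=1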
So it looks like you need more disk space.
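One way to confirm is to check free space on the volume backing the compactor's data directory (pod name is illustrative); as a rule of thumb the compactor needs enough local scratch space for the source blocks of its largest compaction group plus the newly written block:

kubectl exec <compactor-pod> -- df -h /var/thanos/compactor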
Due to lack of response, I will close this one. Feel free to reopen it if you still see this issue.
Thanos, Prometheus and Golang version used: Thanos v0.21.1
Object Storage Provider: COS
What happened: The compactor only fetches block metadata; it does not compact or downsample. k8s deploy args:
In the container logs I cannot see any log about starting compaction or downsampling:
Environment: