Open cmg1986 opened 1 year ago
Describe the bug One of the compactors crashes continuously; it is the one compacting a larger tenant's blocks.
To Reproduce Steps to reproduce the behavior:
Expected behavior I expect the compaction process to run smoothly, or at least not to crash when it hits an error.
Environment:
Additional Context
`{"caller":"compact.go:1291","component":"compactor","level":"info","msg":"start sync of metas","org_id":"GC","ts":"2023-10-17T08:04:48.841909267Z"} {"caller":"fetcher.go:327","component":"block.BaseFetcher","concurrency":20,"level":"debug","msg":"fetching meta data","org_id":"GC","ts":"2023-10-17T08:04:48.842241761Z"} {"cached":1124,"caller":"fetcher.go:478","component":"block.BaseFetcher","duration":"1.404187325s","duration_ms":1404,"level":"info","msg":"successfully synchronized block metadata","org_id":"GC","partial":0,"returned":1123,"ts":"2023-10-17T08:04:50.246310867Z"} {"caller":"compact.go:1296","component":"compactor","level":"info","msg":"start of GC","org_id":"GC","ts":"2023-10-17T08:04:50.246405036Z"} {"caller":"compact.go:1319","component":"compactor","level":"info","msg":"start of compactions","org_id":"GC","ts":"2023-10-17T08:04:50.344171068Z"} {"caller":"compact.go:1005","component":"compactor","group":"0@{__org_id__=\"GC\"}","groupKey":"0@7253914978157373696","level":"info","msg":"compaction available and planned; downloading blocks","org_id":"GC","plan":"[01HCW6SYRH47K26KZS2H3K3H18 (min time: 1697414400000, max time: 1697457600000) 01HCW98DQ8Z3A1VF9ZF71SYJZX (min time: 1697450400000, max time: 1697457600000) 01HCW9AQXPMJ3NYDM32BRHA8YH (min time: 1697450400000, max time: 1697457600000) 01HCW9ARN3G0EJMAS4HB3Z8W3F (min time: 1697450400000, max time: 1697457600000) 01HCW9DF4ZGG25BRS09B7F74X9 (min time: 1697450400000, max time: 1697457600000) 01HCW9ANK1TBBD3NDEYW55MEAD (min time: 1697450400000, max time: 1697457600000) 01HCW94VNCB1A1M8S1FVCZWH65 (min time: 1697450400000, max time: 1697457600000) 01HCW9ABKN7SD08GW53CT8B4Q5 (min time: 1697450400000, max time: 1697457600000) 01HCW9A7P4NPB9F615KBD6TGZY (min time: 1697450400000, max time: 1697457600000) 01HCW9ANSN8858929MX4NP5839 (min time: 1697450400000, max time: 1697457600000) 01HCW9AWTXMJC433BS4S2EFMJB (min time: 1697450400000, max time: 1697457600000) 01HCW9AQD3T052P8BWNAMCCQ2D (min time: 
1697450400000, max time: 1697457600000) 01HCW97R6N93ERRHQ1XFE06Q5E (min time: 1697450400000, max time: 1697457600000) 01HCW9D4BQRM32S1F6XAZ2C0QZ (min time: 1697450400000, max time: 1697457600000) 01HCW99NCXR8BEWJ04MJB4JRSF (min time: 1697450400000, max time: 1697457600000) 01HCW95E83727PJC00DP7WXX3T (min time: 1697450400000, max time: 1697457600000) 01HCW9AR1S71JZD4B79M1ADYYE (min time: 1697450400000, max time: 1697457600000) 01HCW94Y269VJ81V1GM37V1V3E (min time: 1697450400000, max time: 1697457600000) 01HCW954EP3QMMD0H0QYCAN2X7 (min time: 1697450400000, max time: 1697457600000) 01HCW95G5TXN2T60RWCG3NWXBH (min time: 1697450400000, max time: 1697457600000) 01HCW9AW5JVG0J20YW7037YSCF (min time: 1697450400000, max time: 1697457600000) 01HCW7J5V1Y4AES2VXQZ67JHW7 (min time: 1697450400000, max time: 1697457600000) 01HCW9AGP7AFBXPV0QG1Z9DAJN (min time: 1697450400000, max time: 1697457600000) 01HCW9APYVBEFJP9JBAM2WA4M0 (min time: 1697450400001, max time: 1697457600000) 01HCW957JWMX807P9W0P7B9YY7 (min time: 1697450400001, max time: 1697457600000) 01HCW9CNHZEVJS43N82XJXEP4J (min time: 1697450400001, max time: 1697457600000) 01HCW9D58PM10G7RGAK70CZ6A1 (min time: 1697450400001, max time: 1697457600000) 01HCW950D8AKTNBZGM9GN2HZM7 (min time: 1697450400001, max time: 1697457600000) 01HCW98QZ5Q9SMNZEZCFNJ4XN8 (min time: 1697450400001, max time: 1697457600000) 01HCW95AZ2R9CX4XYT88B1QHXS (min time: 1697450400001, max time: 1697457600000) 01HCW7XY0ZESYAGQVJ3MKDMQKX (min time: 1697450400001, max time: 1697457600000) 01HCW9AVRCS1CZW1VXBMPX7803 (min time: 1697450400001, max time: 1697457600000) 01HCW9DD3FW6KQ841P27KJK5D8 (min time: 1697450400001, max time: 1697457600000) 01HCW9BWNJ9Z98DCD42CN92VRZ (min time: 1697450400001, max time: 1697457600000) 01HCW99Y71N7DHP38XP2FWZBE4 (min time: 1697450400001, max time: 1697457600000) 01HCW9DMC30C8ZVA16NHK4S2RB (min time: 1697450400001, max time: 1697457600000) 01HCW9AWVA3VJXG8CP6SPS1GT8 (min time: 1697450400001, max time: 1697457600000) 
01HCW7P0P6KPJN24KZ6CS69XA2 (min time: 1697450400001, max time: 1697457600000) 01HCW95ASC3N3PQ50SFF999XXS (min time: 1697450400001, max time: 1697457600000) 01HCW9AMJDK4FVFCSYA961S9DW (min time: 1697450400001, max time: 1697457600000) 01HCW953R3T6059ZGWHHXXGMWB (min time: 1697450400001, max time: 1697457600000) 01HCW955ZYPHMM71EXS4CCCNPK (min time: 1697450400001, max time: 1697457600000) 01HCW8AFZW1JHJDT89ZQQ2H9G6 (min time: 1697450400001, max time: 1697457600000) 01HCW94YTXY53B5YCNAMXAHDSQ (min time: 1697450400001, max time: 1697457600000) 01HCW9B5D7BPFY06YJSBN9B5KA (min time: 1697450400001, max time: 1697457600000) 01HCW9DSZXTMS823J18NH66AQM (min time: 1697450400001, max time: 1697457600000) 01HCW99FXNGW1C9WS6WBQDCC3K (min time: 1697450400002, max time: 1697457600000) 01HCW9D43HHR6GA4MARC1Z7J2M (min time: 1697450400002, max time: 1697457600000) 01HCW9AH0HYD58VTWGG4S5EX99 (min time: 1697450400002, max time: 1697457600000) 01HCW9AMNZ5C7X5GB1EXG6NJ6R (min time: 1697450400002, max time: 1697457600000) 01HCW98FVQ9RHCBJ9ZBYPSMSSV (min time: 1697450400002, max time: 1697457600000) 01HCW9CAWMRX7YC91BMS59A12E (min time: 1697450400002, max time: 1697457600000) 01HCW9B4BET2CDAKS0A09Q9AJ8 (min time: 1697450400002, max time: 1697457600000) 01HCW96GKHN26HVQHJZ1HQEA3P (min time: 1697450400003, max time: 1697457600000) 01HCW9DE1MZXD7G5ZD3HVNSJ47 (min time: 1697450400003, max time: 1697457600000) 01HCW843AWYT8HQG1VGAW35FQK (min time: 1697450400003, max time: 1697457600000) 01HCW9BREG7X0VT5G1TGYK58QA (min time: 1697450400003, max time: 1697457600000) 01HCW9ARM74A21RQ5BKYXE01YS (min time: 1697450400004, max time: 1697457600000) 01HCW9BRFJV2C94ZJCCJ7HXWT5 (min time: 1697450400004, max time: 1697457600000) 01HCW97ATEXAANENGD472D0EBB (min time: 1697450400004, max time: 1697457600000) 01HCW9547581FW2P1PHTFQ4W9K (min time: 1697450400004, max time: 1697457600000) 01HCW9CVH17X1TTCPJ245XKVQD (min time: 1697450400004, max time: 1697457600000) 01HCW95528PDASMJ5MKVGYHCFQ (min time: 
1697450400004, max time: 1697457600000) 01HCW9B5CAD1K9VR0H2EH5RR96 (min time: 1697450400006, max time: 1697457600000) 01HCW9AH4HQBTQM9CMGBMDC8Y8 (min time: 1697450400006, max time: 1697457600000) 01HCW9BWP8AWSFN8MH4P3SYR3C (min time: 1697450400006, max time: 1697457600000) 01HCW95T2VK2CXR65BTQD3XYR2 (min time: 1697450400006, max time: 1697457600000) 01HCW9DVMVAFPXDV4N06GJG4WP (min time: 1697450400006, max time: 1697457600000) 01HCW950VCN1ZBW4TVH30CEPVD (min time: 1697450400006, max time: 1697457600000) 01HCW9B3SAKJEQ1DZRE93MBZCF (min time: 1697450400006, max time: 1697457600000) 01HCW9BB6SX02BF8QE5CT2QBTH (min time: 1697450400006, max time: 1697457600000) 01HCW959YP7JRHPT2ZT3GBYZ3V (min time: 1697450400009, max time: 1697457600000) 01HCW9DRNX544RNFF29C68FT4M (min time: 1697450400009, max time: 1697457600000) 01HCW96AB3CSZ2K77EQMGSAHD6 (min time: 1697450400010, max time: 1697457600000) 01HCWA6VQ5698J86B2NSQVV9WW (min time: 1697452876670, max time: 1697457600000) 01HCWA1RR752S54GJJZX2ZNRKR (min time: 1697452884390, max time: 1697457600000) 01HCW9JHHTZQPD436C8E5WMJFC (min time: 1697452886569, max time: 1697457600000) 01HCWAZM8G2RRKZTEGQ8WG6T0Y (min time: 1697452886758, max time: 1697457600000) 01HCWAHE4K84ZE3S7NXN1E3SHK (min time: 1697452886872, max time: 1697457600000) 01HCW9Q7CTFV22CCYBY8EPETEG (min time: 1697452886972, max time: 1697457600000) 01HCW9WKR852BHJ0ZYQ5790RFZ (min time: 1697452888899, max time: 1697457600000) 01HCWAB5J0897T9VJ26SA36JEV (min time: 1697452888899, max time: 1697457600000) 01HCWAV8FH3R00RQ744VXKEBG9 (min time: 1697452891793, max time: 1697457600000) 01HCWAPP4P47H6GJTE57BQHZ00 (min time: 1697452891793, max time: 1697457600000) 01HCW8DWXHTH4X6WEGJMAMWK5A (min time: 1697452997619, max time: 1697457600000) 01HCW90AKHZENW3V8TBMP9BNB0 (min time: 1697453001867, max time: 1697457600000) 01HCW7TTM6RV53EZ3SSMQ911JK (min time: 1697453060709, max time: 1697457600000) 01HCW7TTM06F8T792ZNGD076EY (min time: 1697453060709, max time: 1697457600000) 
01HCWB59HYXASVYR7MH5PJZKCC (min time: 1697456120833, max time: 1697457600000) 01HCWBBHSAXY9D96NQWZATP0A6 (min time: 1697456463481, max time: 1697457600000) 01HCWBKC1F9GC9CMM2YMCPQC6E (min time: 1697456605750, max time: 1697457600000) 01HCWBSSFF9473CPJRGKCR2P2X (min time: 1697456760770, max time: 1697457600000) 01HCWBZWPT5WGKNBZHZARDA3GT (min time: 1697456929949, max time: 1697457600000) 01HCWC7KWQB4MPXM4VTZ9RA6AG (min time: 1697457021954, max time: 1697457600000) 01HCWCDR19J0HG9NG3QPB63Y7Y (min time: 1697457299399, max time: 1697457600000) 01HCWCKGTA82F9CBRYRFKXPQAR (min time: 1697457470943, max time: 1697457600000)]","ts":"2023-10-17T08:04:50.43689535Z"} {"caller":"objstore.go:361","component":"compactor","file":"01HCW9AQXPMJ3NYDM32BRHA8YH/meta.json","group":"0@{__org_id__=\"GC\"}","groupKey":"0@7253914978157373696","level":"debug","msg":"not downloading again because a provided path matches this one","org_id":"GC","ts":"2023-10-17T08:04:50.449742708Z"} {"caller":"objstore.go:361","component":"compactor","file":"01HCW98DQ8Z3A1VF9ZF71SYJZX/meta.json","group":"0@{__org_id__=\"GC\"}","groupKey":"0@7253914978157373696","level":"debug","msg":"not downloading again because a provided path matches this one","org_id":"GC","ts":"2023-10-17T08:04:50.453705345Z"} {"caller":"objstore.go:361","component":"compactor","file":"01HCW6SYRH47K26KZS2H3K3H18/meta.json","group":"0@{__org_id__=\"GC\"}","groupKey":"0@7253914978157373696","level":"debug","msg":"not downloading again because a provided path matches this one","org_id":"GC","ts":"2023-10-17T08:04:50.453822594Z"} unexpected fault address 0x7ffef52a17ad fatal error: fault [signal SIGBUS: bus error code=0x2 addr=0x7ffef52a17ad pc=0x9bca58] goroutine 1792576 [running]: runtime.throw({0x2863df0?, 0xc001b08fe0?}) /usr/local/go/src/runtime/panic.go:1047 +0x5d fp=0xc001b08f70 sp=0xc001b08f40 pc=0x43907d runtime.sigpanic() /usr/local/go/src/runtime/signal_unix.go:834 +0x125 fp=0xc001b08fd0 sp=0xc001b08f70 pc=0x44fde5 
github.com/dennwc/varint.Uvarint({0x7ffef52a17ad?, 0xc001b09010?, 0x455339?}) /__w/cortex/cortex/vendor/github.com/dennwc/varint/varint.go:75 +0x18 fp=0xc001b08fd8 sp=0xc001b08fd0 pc=0x9bca58 github.com/prometheus/prometheus/tsdb/encoding.(*Decbuf).Uvarint64(0xc001b09060) /__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/encoding/encoding.go:242 +0x3e fp=0xc001b09000 sp=0xc001b08fd8 pc=0xce817e github.com/prometheus/prometheus/tsdb/encoding.(*Decbuf).UvarintBytes(0xc001b09060) /__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/encoding/encoding.go:206 +0x25 fp=0xc001b09020 sp=0xc001b09000 pc=0xce7f25 github.com/prometheus/prometheus/tsdb/index.Symbols.Lookup({{0x2e8a720, 0xc001cda000}, 0x2, 0x5, {0xc005280000, 0x14a50, 0x14a50}, 0x2949f1}, 0x1018) /__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/index/index.go:1316 +0x245 fp=0xc001b09098 sp=0xc001b09020 pc=0xcf29c5 github.com/prometheus/prometheus/tsdb/index.(*Reader).lookupSymbol(0xc00062e360, 0x428f25?) /__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/index/index.go:1447 +0xd8 fp=0xc001b09138 sp=0xc001b09098 pc=0xcf3818 github.com/prometheus/prometheus/tsdb/index.(*Reader).lookupSymbol-fm(0x1b09210?) <autogenerated>:1 +0x2b fp=0xc001b09158 sp=0xc001b09138 pc=0xcfdceb github.com/prometheus/prometheus/tsdb/index.(*Decoder).Series(0xc00102e000, {0x7fff2ef9c7c2?, 0x39d437c0?, 0x7ffff7fbd108?}, 0xc001b09570, 0xc001b09558) /__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/index/index.go:1861 +0x151 fp=0xc001b092c8 sp=0xc001b09158 pc=0xcf63d1 github.com/prometheus/prometheus/tsdb/index.(*Reader).Series(0xc00062e360, 0x39d437c, 0xc001b09570?, 0xc00364b380?) 
/__w/cortex/cortex/vendor/github.com/prometheus/prometheus/tsdb/index/index.go:1611 +0x127 fp=0xc001b09378 sp=0xc001b092c8 pc=0xcf49c7 github.com/thanos-io/thanos/pkg/block.GatherIndexHealthStats({0x2e7a3a0, 0xc00013d770}, {_, _}, _, _) /__w/cortex/cortex/vendor/github.com/thanos-io/thanos/pkg/block/index.go:255 +0x566 fp=0xc001b09660 sp=0xc001b09378 pc=0x1972306 github.com/thanos-io/thanos/pkg/compact.(*Group).compact.func2.1.2({0x2e95968?, 0xc00013c140?}) /__w/cortex/cortex/vendor/github.com/thanos-io/thanos/pkg/compact/compact.go:1037 +0xdd fp=0xc001b09978 sp=0xc001b09660 pc=0x1dd26fd github.com/thanos-io/thanos/pkg/tracing.DoInSpanWithErr({0x2e95968?, 0xc00013c140?}, {0x28a1f68?, 0x8?}, 0xc001b09f38, {0xc0013d8020?, 0xc003443b00?, 0x416050?}) /__w/cortex/cortex/vendor/github.com/thanos-io/thanos/pkg/tracing/tracing.go:82 +0xd0 fp=0xc001b09a18 sp=0xc001b09978 pc=0x1ccad10 github.com/thanos-io/thanos/pkg/compact.(*Group).compact.func2.1() /__w/cortex/cortex/vendor/github.com/thanos-io/thanos/pkg/compact/compact.go:1036 +0x45c fp=0xc001b09f78 sp=0xc001b09a18 pc=0x1dd203c golang.org/x/sync/errgroup.(*Group).Go.func1() /__w/cortex/cortex/vendor/golang.org/x/sync/errgroup/errgroup.go:75 +0x64 fp=0xc001b09fe0 sp=0xc001b09f78 pc=0xd012c4 runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc001b09fe8 sp=0xc001b09fe0 pc=0x46fd61 created by golang.org/x/sync/errgroup.(*Group).Go /__w/cortex/cortex/vendor/golang.org/x/sync/errgroup/errgroup.go:72 +0xa5 goroutine 1 [select, 810 minutes]: runtime.gopark(0xc000dbc068?, 0x2?, 0xe8?, 0xa4?, 0xc000dbc064?) 
/usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0017fbef0 sp=0xc0017fbed0 pc=0x43bd96 runtime.selectgo(0xc0017fc068, 0xc000dbc060, 0x16374e0?, 0x0, 0x2e959a0?, 0x1) /usr/local/go/src/runtime/select.go:327 +0x7be fp=0xc0017fc030 sp=0xc0017fbef0 pc=0x44c03e github.com/cortexproject/cortex/pkg/util/services.(*Manager).AwaitStopped(0xc001262720, {0x2e959a0, 0xc00007a028}) /__w/cortex/cortex/pkg/util/services/manager.go:145 +0x6d fp=0xc0017fc098 sp=0xc0017fc030 pc=0x1639dcd github.com/cortexproject/cortex/pkg/cortex.(*Cortex).Run(0xc000e60000) /__w/cortex/cortex/pkg/cortex/cortex.go:459 +0x925 fp=0xc0017fc260 sp=0xc0017fc098 pc=0x2140145 main.main() /__w/cortex/cortex/cmd/cortex/main.go:196 +0xdf0 fp=0xc0017fff80 sp=0xc0017fc260 pc=0x214d710 runtime.main() /usr/local/go/src/runtime/proc.go:250 +0x207 fp=0xc0017fffe0 sp=0xc0017fff80 pc=0x43b967 runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0017fffe8 sp=0xc0017fffe0 pc=0x46fd61 goroutine 2 [force gc (idle), 810 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000110fb0 sp=0xc000110f90 pc=0x43bd96 runtime.goparkunlock(...) /usr/local/go/src/runtime/proc.go:387 runtime.forcegchelper() /usr/local/go/src/runtime/proc.go:305 +0xb0 fp=0xc000110fe0 sp=0xc000110fb0 pc=0x43bbd0 runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000110fe8 sp=0xc000110fe0 pc=0x46fd61 created by runtime.init.6 /usr/local/go/src/runtime/proc.go:293 +0x25 goroutine 3 [GC sweep wait]: runtime.gopark(0x42c5f01?, 0x0?, 0x0?, 0x0?, 0x0?) /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000111780 sp=0xc000111760 pc=0x43bd96 runtime.goparkunlock(...) /usr/local/go/src/runtime/proc.go:387 runtime.bgsweep(0x0?) 
/usr/local/go/src/runtime/mgcsweep.go:319 +0xde fp=0xc0001117c8 sp=0xc000111780 pc=0x425e3e runtime.gcenable.func1() /usr/local/go/src/runtime/mgc.go:178 +0x26 fp=0xc0001117e0 sp=0xc0001117c8 pc=0x41b0a6 runtime.goexit() /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0001117e8 sp=0xc0001117e0 pc=0x46fd61 created by runtime.gcenable /usr/local/go/src/runtime/mgc.go:178 +0x6b`
The complete stack trace is available here: https://slack-files.com/T08PSQ7BQ-F062AEM0RA4-72eb738ee7
@cmg1986 What is the size of the index of each source block? And what is the compaction level of each source block?
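If it helps, here is a minimal sketch for collecting both values. It assumes the source blocks are already on local disk (the log above says the meta files were "not downloading again"), that each block directory holds the standard TSDB `meta.json` and `index` files, and that the directory path is passed on the command line. The function names and the script itself are hypothetical, not part of Cortex or Thanos.

```python
import json
import os
import sys


def describe_block(block_dir):
    """Return (ulid, compaction level, index size in bytes) for one block directory."""
    with open(os.path.join(block_dir, "meta.json")) as f:
        meta = json.load(f)
    # TSDB block metadata stores the level under "compaction" -> "level".
    level = meta.get("compaction", {}).get("level")
    index_path = os.path.join(block_dir, "index")
    # The index may be missing if the download was interrupted; report None then.
    index_size = os.path.getsize(index_path) if os.path.exists(index_path) else None
    return meta.get("ulid"), level, index_size


def describe_blocks(root):
    """Describe every subdirectory of root that contains a meta.json."""
    rows = []
    for name in sorted(os.listdir(root)):
        block_dir = os.path.join(root, name)
        if os.path.isfile(os.path.join(block_dir, "meta.json")):
            rows.append(describe_block(block_dir))
    return rows


if __name__ == "__main__":
    # Usage (path is an assumption): python describe_blocks.py /path/to/downloaded/blocks
    for ulid, level, size in describe_blocks(sys.argv[1]):
        print(ulid, "level=%s" % level, "index_bytes=%s" % size)
```

Running it against the directory that holds the blocks listed in the plan prints one line per block, which can be pasted back here.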