Closed kyle-a-wong closed 3 days ago
Describe the problem
Please describe the issue you observed, and any steps we can take to reproduce it:
In v23.2.10, a cluster operating in an unfinalized state will see node restarts because of an NPE (stack trace below):
github.com/cockroachdb/cockroach/pkg/server.(*systemStatusServer).spanStatsFanOut.func3(0x3b9bc28?, {0x60d27e0?, 0xc1cc24ae60?})
	github.com/cockroachdb/cockroach/pkg/server/span_stats_server.go:116 +0x114
github.com/cockroachdb/cockroach/pkg/server.(*statusServer).iterateNodes(0xc020471780, {0x7a7a7e8, 0xc033fa4930}, {0x640986c, 0x1e}, 0xdf8475800, 0xc0a72bb650, 0xc0a72bb668, 0xc003b9bde0, 0xc003b9bdf8)
	github.com/cockroachdb/cockroach/pkg/server/status.go:3141 +0x557
github.com/cockroachdb/cockroach/pkg/server.(*systemStatusServer).spanStatsFanOut(0xc02040a8c0, {0x7a7a7e8?, 0xc033fa4930}, 0xc3423e83c0)
	github.com/cockroachdb/cockroach/pkg/server/span_stats_server.go:138 +0x41b
github.com/cockroachdb/cockroach/pkg/server.(*systemStatusServer).getSpanStatsInternal(0x594c2a0?, {0x7a7a7e8, 0xc033fa4930}, 0xc3423e83c0)
	github.com/cockroachdb/cockroach/pkg/server/span_stats_server.go:287 +0x38
github.com/cockroachdb/cockroach/pkg/server.batchedSpanStats({0x7a7a7e8, 0xc033fa4930}, 0xc3423e83c0, 0xc003b9c108, 0x3e8)
	github.com/cockroachdb/cockroach/pkg/server/span_stats_server.go:464 +0x2ce
github.com/cockroachdb/cockroach/pkg/server.(*systemStatusServer).SpanStats(0xc02040a8c0, {0x7a7a7e8?, 0xc033fa47e0?}, 0x7a7a7e8?)
	github.com/cockroachdb/cockroach/pkg/server/status.go:3673 +0x127
github.com/cockroachdb/cockroach/pkg/sql.(*planner).SpanStats(0xc1997b6670, {0x7a7a7e8, 0xc033fa47e0}, {0xc0076f0000, 0x1d684, 0x21155})
	github.com/cockroachdb/cockroach/pkg/sql/planner.go:943 +0xb1
github.com/cockroachdb/cockroach/pkg/sql/sem/builtins.(*spanStatsValueGenerator).Start(0xc300bcf0a0, {0x7a7a7e8?, 0xc033fa47e0?}, 0x61b8680?)
	github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/generator_builtins.go:3473 +0x3c
This code hasn't been changed since v23.2.10, so it seems likely that this same NPE can occur in more recent versions as well.
To Reproduce
// TODO
Additional data / screenshots
The stack trace above points to this line in the code: https://github.com/cockroachdb/cockroach/blob/c68c559859be738efead9971f5e11f62a8c69d06/pkg/server/span_stats_server.go#L147
So the two suspected culprits for the NPE are res.SpanToStats[spanStr] and spanStats.TotalStats: either the map lookup returns a nil entry for that span, or the incoming stats carry a nil TotalStats.
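To illustrate how either suspect could produce a nil pointer dereference, here is a minimal, self-contained sketch. It does not reproduce the actual code in span_stats_server.go. The types MVCCStats, SpanStats, and Response and the helper mergeSafe are hypothetical stand-ins, and the pointer-typed TotalStats field is an assumption made only so that both suspects can be nil. The sketch shows one possible defensive pattern: check the incoming stats and the map entry before touching them.

```go
package main

import "fmt"

// MVCCStats is a hypothetical stand-in for the real stats type.
type MVCCStats struct {
	KeyCount int64
}

// SpanStats is a hypothetical stand-in. TotalStats is a pointer here
// (an assumption) so that it can be nil, as the issue suspects.
type SpanStats struct {
	TotalStats *MVCCStats
	RangeCount int32
}

// Response is a hypothetical stand-in for the fan-out response.
type Response struct {
	SpanToStats map[string]*SpanStats
}

// mergeSafe adds the incoming per-node stats into res for spanStr.
// It guards both suspected nil dereferences:
//   - res.SpanToStats[spanStr] may be missing or nil,
//   - in.TotalStats may be nil.
//
// It reports whether anything was merged.
func mergeSafe(res *Response, spanStr string, in *SpanStats) bool {
	if in == nil || in.TotalStats == nil {
		// Nothing usable from this node; skip instead of panicking.
		return false
	}
	if res.SpanToStats == nil {
		res.SpanToStats = make(map[string]*SpanStats)
	}
	dst, ok := res.SpanToStats[spanStr]
	if !ok || dst == nil {
		// An unguarded res.SpanToStats[spanStr].TotalStats access
		// would dereference a nil pointer in this case.
		dst = &SpanStats{}
		res.SpanToStats[spanStr] = dst
	}
	if dst.TotalStats == nil {
		dst.TotalStats = &MVCCStats{}
	}
	dst.TotalStats.KeyCount += in.TotalStats.KeyCount
	dst.RangeCount += in.RangeCount
	return true
}

func main() {
	res := &Response{SpanToStats: map[string]*SpanStats{}}

	// Entry for "a" does not exist yet: created on demand.
	merged := mergeSafe(res, "a", &SpanStats{
		TotalStats: &MVCCStats{KeyCount: 3},
		RangeCount: 1,
	})
	fmt.Println("merged:", merged, "keys:", res.SpanToStats["a"].TotalStats.KeyCount)

	// Incoming stats with nil TotalStats: skipped rather than dereferenced.
	merged = mergeSafe(res, "a", &SpanStats{RangeCount: 1})
	fmt.Println("merged:", merged, "ranges:", res.SpanToStats["a"].RangeCount)
}
```

Whether skipping, creating the entry on demand, or returning an error is the right behavior depends on why the entry or TotalStats is nil on an unfinalized cluster, which this issue has not yet established.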
Environment: The issue was experienced in v23.2.10, but it is likely that this bug exists in the main branch as well.
Additional context What was the impact? Nodes were restarting due to panics from the NPE
Jira issue: CRDB-42842
Hi @kyle-a-wong, please add branch-* labels to identify which branch(es) this C-bug affects.
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.