kubecost / features-bugs

A public repository for filing of Kubecost feature requests and bugs. Please read the issue guidelines before filing an issue here.

[Bug] Aggregator pod crashes intermittently #58

Closed tabossert closed 8 months ago

tabossert commented 9 months ago

Kubecost Helm Chart Version

2.0.2

Kubernetes Version

1.27

Kubernetes Platform

AKS

Description

Intermittently, the Kubecost pod restarts due to an error in the aggregator pod, as shown in the logs below.

We have tuned resources as much as possible, so it doesn't seem to be related to OOM or disk slowness.

Steps to reproduce

  1. Leave Kubecost running in the cluster and wait for it to restart.

Expected behavior

The pod should not restart.

Impact

Our scripts to pull data out fail when this happens

Screenshots

No response

Logs

aggregator goroutine 1144450 [chan send]:
aggregator runtime.gopark(0x1?, 0xc06b7ec400?, 0x50?, 0x1?, 0x41?)
aggregator     /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc052fd7d80 sp=0xc052fd7d60 pc=0x651cee
aggregator runtime.chansend(0xc007491b60, 0xc052fd7f10, 0x1, 0xc01aa55f20?)
aggregator     /usr/local/go/src/runtime/chan.go:259 +0x3a5 fp=0xc052fd7df0 sp=0xc052fd7d80 pc=0x61c445
aggregator runtime.chansend1(0x4e38c19?, 0x16?)
aggregator     /usr/local/go/src/runtime/chan.go:145 +0x17 fp=0xc052fd7e20 sp=0xc052fd7df0 pc=0x61c097
aggregator github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.1(0xc049aa7b80, {0xc049aa1600, 0xc, 0x10}, {0x80cab80, 0x
aggregator     /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1351 +0x2ee fp=0xc052fd7f58 sp=0xc052fd7e20 pc=0x2d9af6e
aggregator github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.5()
aggregator     /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1380 +0x91 fp=0xc052fd7fe0 sp=0xc052fd7f58 pc=0x2d9ac31
aggregator runtime.goexit()
aggregator     /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc052fd7fe8 sp=0xc052fd7fe0 pc=0x684b81
aggregator created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1 in goroutine 796328
aggregator     /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1338 +0x377

Slack discussion

No response

Troubleshooting

AjayTripathy commented 9 months ago

cc @cliffcolvin, can you take a look here? Has this been fixed in the upcoming 2.1 RCs?

chipzoller commented 9 months ago

Transferred.

cliffcolvin commented 9 months ago

We're taking a look right now.

michaelmdresser commented 9 months ago

@tabossert do you have any further log context from this crash? About 5 lines after and 15-20 lines preceding would help me here.

tabossert commented 9 months ago

goroutine 2513173 [runnable]:
runtime.cgocall(0x3225fb0, 0xc0330cb230)
    /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0330cb208 sp=0xc0330cb1d0 pc=0x61adcb
github.com/marcboeker/go-duckdb._Cfunc_duckdb_execute_pending(0x7f641502d250, 0xc073047f80)
    _cgo_gotypes.go:1180 +0x4b fp=0xc0330cb230 sp=0xc0330cb208 pc=0x2d51f4b
github.com/marcboeker/go-duckdb.(*stmt).execute.func7(0x0?, 0x80cab01?)
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/statement.go:225 +0x65 fp=0xc0330cb270 sp=0xc0330cb230 pc=0x2d5d085
github.com/marcboeker/go-duckdb.(*stmt).execute(0xc05877ffb0, {0x5b5dcd8, 0xc088c8d830}, {0x80cab80?, 0x8?, 0x7f64e8088060?})
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/statement.go:225 +0x248 fp=0xc0330cb320 sp=0xc0330cb270 pc=0x2d5cca8
github.com/marcboeker/go-duckdb.(*stmt).QueryContext(0xc05877ffb0, {0x5b5dcd8?, 0xc088c8d830?}, {0x80cab80?, 0x0?, 0x176?})
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/statement.go:175 +0x34 fp=0xc0330cb398 sp=0xc0330cb320 pc=0x2d5c994
github.com/marcboeker/go-duckdb.(*conn).QueryContext(0xc059b77860, {0x5b5dcd8, 0xc088c8d830}, {0xc084129000, 0x18a}, {0x80cab80, 0x0, 0x0})
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/connection.go:96 +0x30a fp=0xc0330cb468 sp=0xc0330cb398 pc=0x2d53cca
database/sql.ctxDriverQuery({0x5b5dcd8?, 0xc088c8d830?}, {0x7f64ec70f130?, 0xc059b77860?}, {0x0?, 0x0?}, {0xc084129000?, 0xc084129000?}, {0x80cab80, 0x0, ...})
    /usr/local/go/src/database/sql/ctxutil.go:48 +0xd7 fp=0xc0330cb4f0 sp=0xc0330cb468 pc=0x16eb8d7
database/sql.(*DB).queryDC.func1()
    /usr/local/go/src/database/sql/sql.go:1748 +0x165 fp=0xc0330cb5b0 sp=0xc0330cb4f0 pc=0x16f3b65
database/sql.withLock({0x5b44ec8, 0xc0723075f0}, 0xc0330cb708)
    /usr/local/go/src/database/sql/sql.go:3502 +0x82 fp=0xc0330cb5f0 sp=0xc0330cb5b0 pc=0x16fb6c2
database/sql.(*DB).queryDC(0x1?, {0x5b5dcd8?, 0xc088c8d830}, {0x0, 0x0}, 0xc0723075f0, 0xc05589dd50, {0xc084129000, 0x18a}, {0x0, ...})
    /usr/local/go/src/database/sql/sql.go:1743 +0x209 fp=0xc0330cb798 sp=0xc0330cb5f0 pc=0x16f34e9
database/sql.(*DB).query(0x0?, {0x5b5dcd8, 0xc088c8d830}, {0xc084129000, 0x18a}, {0x0, 0x0, 0x0}, 0x80?)
    /usr/local/go/src/database/sql/sql.go:1726 +0xfc fp=0xc0330cb818 sp=0xc0330cb798 pc=0x16f325c
database/sql.(*DB).QueryContext.func1(0x80?)
    /usr/local/go/src/database/sql/sql.go:1704 +0x4f fp=0xc0330cb880 sp=0xc0330cb818 pc=0x16f304f
database/sql.(*DB).retry(0x62bdc8?, 0xc0330cb8f0)
    /usr/local/go/src/database/sql/sql.go:1538 +0x42 fp=0xc0330cb8c8 sp=0xc0330cb880 pc=0x16f1842
database/sql.(*DB).QueryContext(0x0?, {0x5b5dcd8?, 0xc088c8d830?}, {0xc084129000?, 0x0?}, {0x0?, 0x5b5dcd8?, 0xc088c8d830?})
    /usr/local/go/src/database/sql/sql.go:1703 +0xc5 fp=0xc0330cb958 sp=0xc0330cb8c8 pc=0x16f2f65
github.com/uptrace/bun.(*SelectQuery).Rows(0xc084a30000, {0x5b5dcd8, 0xc088c8d830})
    /go/pkg/mod/github.com/uptrace/bun@v1.1.16/query_select.go:818 +0x1a8 fp=0xc0330cba18 sp=0xc0330cb958 pc=0x2ce6668
github.com/kubecost/kubecost-cost-model/pkg/duckdb/internal/db.GetLabelsAnnotations({0x5b2a9a0, 0xc000d7cd40}, {0xc0502a0f80, 0x7, 0x8}, {0x80cab80, 0x0, 0x0}, {0xc055890f40, 0x1d}, ...)
    /app/kubecost-cost-model/pkg/duckdb/internal/db/common.go:124 +0x7db fp=0xc0330cbe20 sp=0xc0330cba18 pc=0x2d69fbb
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.1(0xc022a89ce0, {0xc02c037000, 0xf, 0x10}, {0x80cab80, 0x0, 0x0}, {0xc0502a0f80, 0x7, 0x8}, ...)
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1358 +0x4bd fp=0xc0330cbf58 sp=0xc0330cbe20 pc=0x2d9b13d
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.5()
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1380 +0x91 fp=0xc0330cbfe0 sp=0xc0330cbf58 pc=0x2d9ac31
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0330cbfe8 sp=0xc0330cbfe0 pc=0x684b81
created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1 in goroutine 2165198
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1338 +0x377

goroutine 2513177 [sync.Mutex.Lock]:
runtime.gopark(0x2d51d7f?, 0x32253b0?, 0x98?, 0x96?, 0xc021ef9698?)
    /usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc021ef9668 sp=0xc021ef9648 pc=0x651cee
runtime.goparkunlock(...)
    /usr/local/go/src/runtime/proc.go:404
runtime.semacquire1(0xc004bda6a4, 0x0?, 0x3, 0x1, 0x60?)
    /usr/local/go/src/runtime/sema.go:160 +0x218 fp=0xc021ef96d0 sp=0xc021ef9668 pc=0x6631b8
sync.runtime_SemacquireMutex(0xc021ef9748?, 0xd1?, 0x18?)
    /usr/local/go/src/runtime/sema.go:77 +0x25 fp=0xc021ef9708 sp=0xc021ef96d0 pc=0x680b25
sync.(*Mutex).lockSlow(0xc004bda6a0)
    /usr/local/go/src/sync/mutex.go:171 +0x15d fp=0xc021ef9758 sp=0xc021ef9708 pc=0x68fd1d
sync.(*Mutex).Lock(...)
    /usr/local/go/src/sync/mutex.go:90
database/sql.(*driverConn).finalClose(0xc0512c0ab0)
    /usr/local/go/src/database/sql/sql.go:648 +0x133 fp=0xc021ef9800 sp=0xc021ef9758 pc=0x16ed4d3
database/sql.finalCloser.finalClose-fm()
    :1 +0x25 fp=0xc021ef9818 sp=0xc021ef9800 pc=0x16fcb45
database/sql.(*driverConn).Close(0xc0512c0ab0)
    /usr/local/go/src/database/sql/sql.go:623 +0x146 fp=0xc021ef9860 sp=0xc021ef9818 pc=0x16ed366
database/sql.(*DB).putConn(0xc004bda680, 0xc0512c0ab0, {0x0, 0x0}, 0x0?)
    /usr/local/go/src/database/sql/sql.go:1484 +0x2d6 fp=0xc021ef98d0 sp=0xc021ef9860 pc=0x16f1476
database/sql.(*driverConn).releaseConn(...)
    /usr/local/go/src/database/sql/sql.go:527
database/sql.(*driverConn).releaseConn-fm({0x0?, 0x0?})
    :1 +0x3e fp=0xc021ef9908 sp=0xc021ef98d0 pc=0x16fca5e
database/sql.(*Rows).close(0xc03900f950, {0x0, 0x0})
    /usr/local/go/src/database/sql/sql.go:3396 +0x1c7 fp=0xc021ef9998 sp=0xc021ef9908 pc=0x16fae27
database/sql.(*Rows).Close(0x5b38d00?)
    /usr/local/go/src/database/sql/sql.go:3367 +0x26 fp=0xc021ef99c8 sp=0xc021ef9998 pc=0x16fac46
database/sql.(*Rows).Next(0xc03900f950)
    /usr/local/go/src/database/sql/sql.go:2997 +0x96 fp=0xc021ef9a18 sp=0xc021ef99c8 pc=0x16f91b6
github.com/kubecost/kubecost-cost-model/pkg/duckdb/internal/db.GetLabelsAnnotations({0x5b2a9a0, 0xc000d7cd40}, {0xc0502a1780, 0x7, 0x8}, {0x80cab80, 0x0, 0x0}, {0xc02d237a40, 0x1d}, ...)
    /app/kubecost-cost-model/pkg/duckdb/internal/db/common.go:136 +0x989 fp=0xc021ef9e20 sp=0xc021ef9a18 pc=0x2d6a169
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.1(0xc022a942c0, {0xc02c037400, 0xf, 0x10}, {0x80cab80, 0x0, 0x0}, {0xc0502a1780, 0x7, 0x8}, ...)
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1358 +0x4bd fp=0xc021ef9f58 sp=0xc021ef9e20 pc=0x2d9b13d
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.5()
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1380 +0x91 fp=0xc021ef9fe0 sp=0xc021ef9f58 pc=0x2d9ac31
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc021ef9fe8 sp=0xc021ef9fe0 pc=0x684b81
created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1 in goroutine 2165198
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1338 +0x377

goroutine 2513180 [runnable]:
runtime.cgocall(0x3225990, 0xc0353cd848)
    /usr/local/go/src/runtime/cgocall.go:157 +0x4b fp=0xc0353cd820 sp=0xc0353cd7e8 pc=0x61adcb
github.com/marcboeker/go-duckdb._Cfunc_duckdb_result_get_chunk({0x2, 0x0, 0x0, 0x0, 0x0, 0x7f64d4db9a60}, 0x0)
    _cgo_gotypes.go:1465 +0x4c fp=0xc0353cd848 sp=0xc0353cd820 pc=0x2d52c0c
github.com/marcboeker/go-duckdb.(*rows).Next.func2(0x666c69?)
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/rows.go:63 +0x87 fp=0xc0353cd8d0 sp=0xc0353cd848 pc=0x2d56547
github.com/marcboeker/go-duckdb.(*rows).Next(0xc074e44f00, {0xc087be4b20, 0x2, 0x4c3a640?})
    /go/pkg/mod/github.com/marcboeker/go-duckdb@v1.5.5/rows.go:63 +0x79 fp=0xc0353cd900 sp=0xc0353cd8d0 pc=0x2d562b9
database/sql.(*Rows).nextLocked(0xc07e6745a0)
    /usr/local/go/src/database/sql/sql.go:3019 +0x107 fp=0xc0353cd960 sp=0xc0353cd900 pc=0x16f9367
database/sql.(*Rows).Next.func1()
    /usr/local/go/src/database/sql/sql.go:2994 +0x29 fp=0xc0353cd988 sp=0xc0353cd960 pc=0x16f9229
database/sql.withLock({0x5b38d00, 0xc07e6745d8}, 0xc0353cd9e8)
    /usr/local/go/src/database/sql/sql.go:3502 +0x82 fp=0xc0353cd9c8 sp=0xc0353cd988 pc=0x16fb6c2
database/sql.(*Rows).Next(0xc07e6745a0)
    /usr/local/go/src/database/sql/sql.go:2993 +0x85 fp=0xc0353cda18 sp=0xc0353cd9c8 pc=0x16f91a5
github.com/kubecost/kubecost-cost-model/pkg/duckdb/internal/db.GetLabelsAnnotations({0x5b2a9a0, 0xc000d7cd40}, {0xc0502a1c00, 0x7, 0x8}, {0x80cab80, 0x0, 0x0}, {0xc056c9ff00, 0x1d}, ...)
    /app/kubecost-cost-model/pkg/duckdb/internal/db/common.go:136 +0x989 fp=0xc0353cde20 sp=0xc0353cda18 pc=0x2d6a169
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.1(0xc022a946e0, {0xc02c037700, 0xf, 0x10}, {0x80cab80, 0x0, 0x0}, {0xc0502a1c00, 0x7, 0x8}, ...)
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1358 +0x4bd fp=0xc0353cdf58 sp=0xc0353cde20 pc=0x2d9b13d
github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1.5()
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1380 +0x91 fp=0xc0353cdfe0 sp=0xc0353cdf58 pc=0x2d9ac31
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0353cdfe8 sp=0xc0353cdfe0 pc=0x684b81
created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/allocation/db.(*AllocationDBQueryService).QueryAllocations.func1 in goroutine 2165198
    /app/kubecost-cost-model/pkg/duckdb/allocation/db/allocationqueryservice.go:1338 +0x377

rax    0x0
rbx    0x7f64f498a640
rcx    0x7f653bdba9fc
rdx    0x6
rdi    0x1
rsi    0x7
rbp    0x7
rsp    0x7f64f4987710
r8     0x7f64f49877e0
r9     0x7fffffff
r10    0x8
r11    0x246
r12    0x6
r13    0x16
r14    0xc0000069c0
r15    0x4
rip    0x7f653bdba9fc
rflags 0x246
cs     0x33
fs     0x0
gs     0x0

michaelmdresser commented 9 months ago

@tabossert That's helpful, thank you for the quick response. I'm looking for the first instance of the goroutine ... string in the logs, the 15-20 lines preceding it, and the stack trace attached to that specific goroutine. In Go, the stack trace for every goroutine is printed on a panic like this, but the offending goroutine's trace is printed first, which is why I'm asking for it, plus the log context that led us to that trace.

If you'd like, I can make it easier on you -- you can share the log file with me privately via email: michael@kubecost.com
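For anyone collecting the same information from their own logs, here is a minimal Go sketch of the request above: it finds the first goroutine header in a log and returns it with the preceding context lines and that goroutine's trace. The function name `firstTraceWithContext`, the regular expression, and the sample log are illustrative assumptions, not part of Kubecost.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// goroutineHeader matches lines such as "goroutine 27 [running]:". It is not
// anchored, so a log prefix (for example a container name) is tolerated.
var goroutineHeader = regexp.MustCompile(`goroutine \d+ \[[^\]]+\]:`)

// firstTraceWithContext returns up to `before` lines preceding the first
// goroutine header, the header itself, and the trace lines that follow it,
// stopping at the next blank line or the next goroutine header.
func firstTraceWithContext(log string, before int) []string {
	lines := strings.Split(log, "\n")
	for i, line := range lines {
		if !goroutineHeader.MatchString(line) {
			continue
		}
		start := i - before
		if start < 0 {
			start = 0
		}
		end := i + 1
		for end < len(lines) &&
			strings.TrimSpace(lines[end]) != "" &&
			!goroutineHeader.MatchString(lines[end]) {
			end++
		}
		return lines[start:end]
	}
	return nil // no goroutine header found
}

func main() {
	// Hypothetical sample; in practice, pass the contents of the saved pod log.
	sample := "INF starting\nERR something failed\ngoroutine 7 [running]:\nmain.main()\n\t/app/main.go:10 +0x1\n\ngoroutine 8 [chan send]:\nmain.worker()"
	for _, l := range firstTraceWithContext(sample, 20) {
		fmt.Println(l)
	}
}
```

Passing `20` for `before` matches the 15-20 lines of context requested above; lines after the crash would still need to be copied separately.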

michaelmdresser commented 9 months ago

To clarify: I need more log context to understand what's going wrong here. Please either share a full log file or share the requested first trace + surrounding context I mentioned above.

tabossert commented 9 months ago

Email sent with full log @michaelmdresser

michaelmdresser commented 9 months ago

Thank you @tabossert. I have a pretty strong theory about what's going wrong here -- there are a few different resolution paths if this is what I think it is.

If you are willing to try a pre-production release, please upgrade to Kubecost v2.1.0-rc.6 or v2.1.0 when it is released, which is imminent. I am fairly certain that you are experiencing an issue which has been fixed in v2.1.

Otherwise, if you would like to stay on v2.0.2:

  1. If this crash happens once per day, within a few hours of UTC midnight, disable the Forecasting Pod using the Helm value forecasting.enabled=false
  2. If this crash coincides with the run of your "scripts to pull data out" and if those scripts make a call to /model/allocation, please set the query parameter includeAggregatedMetadata=false. Also, if these queries have no aggregate parameter (or a high-cardinality one like aggregate=pod), I recommend using the limit and offset query parameters to paginate the response, e.g. limit=100&offset=0 -> limit=100&offset=100 -> limit=100&offset=200.

tabossert commented 9 months ago

Thanks, we will try those workarounds until v2.1.0 is released. Thanks for the quick response!

tabossert commented 9 months ago

I tried upgrading to 2.1.0-rc6, but it didn't seem to load the data, so I'm not sure if I missed something. I tried going back to 2.0.2, but now it gives me this error:

2024-02-22T01:16:58.356391917Z ERR error doing initial open of DB: error opening db at path /var/configs/waterfowl/duckdb/v0_9_2/kubecost.duckdb.write: migrating up: no migration found for version 20240212233831: read down for version 20240212233831 migrations: file does not exist
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x16ee895]

goroutine 27 [running]:
database/sql.(*DB).Close(0x0)
    /usr/local/go/src/database/sql/sql.go:877 +0x35
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.startIngestor(0xc0009ccba0, 0xc000f0f4b0?)
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:234 +0x28
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.NewWriter.func5({0x47a00a0?, 0xc0001feba0?}, 0x1?)
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:125 +0x1b
github.com/looplab/fsm.(*FSM).enterStateCallbacks(0xc000f12000, {0x5b5dd10, 0xc0000da5f0}, 0xc0001feba0?)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:470 +0x82
github.com/looplab/fsm.(*FSM).Event.(*FSM).Event.func2.func3()
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:363 +0x150
github.com/looplab/fsm.transitionerStruct.transition(...)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:422
github.com/looplab/fsm.(*FSM).doTransition(...)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:407
github.com/looplab/fsm.(*FSM).Event(0xc000f12000, {0x5b5d8e8, 0x80cab80}, {0x4e18562, 0xd}, {0x0, 0x0, 0x0})
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:390 +0x884
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.NewWriter(0xc0012b70a0, {0xc00128de40, 0x3a}, {0xc00128df40, 0x39})
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:180 +0x6ef
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.createWriter(0xc0012b7040)
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:398 +0x33
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func7({0x47a00a0?, 0xc0013ba3f0?}, 0xc000f08000)
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:212 +0x25
github.com/looplab/fsm.(*FSM).enterStateCallbacks(0xc0013bc500, {0x5b5dd10, 0xc0000da500}, 0xc0013ba3f0?)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:470 +0x82
github.com/looplab/fsm.(*FSM).Event.(*FSM).Event.func2.func3()
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:363 +0x150
github.com/looplab/fsm.transitionerStruct.transition(...)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:422
github.com/looplab/fsm.(*FSM).doTransition(...)
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:407
github.com/looplab/fsm.(*FSM).Event(0xc0013bc500, {0x5b5d8e8, 0x80cab80}, {0x4e4a44e, 0x1b}, {0x0, 0x0, 0x0})
    /go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:390 +0x884
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func6.1()
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:204 +0x3e
created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func6 in goroutine 1
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:203 +0x505
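
The top frame of this trace, `database/sql.(*DB).Close(0x0)`, shows `Close` being called on a nil `*sql.DB` right after the "initial open of DB" error was logged. The Go sketch below reproduces that general class of failure and one possible guard. It is not Kubecost's code: `openDB`, `closeUnguarded`, and `closeGuarded` are hypothetical names, and the real `startIngestor` logic is not shown in this thread.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

// openDB is a hypothetical stand-in for an open step that fails, for example
// because a migration file is missing. It returns a nil handle and an error.
func openDB() (*sql.DB, error) {
	return nil, errors.New("migrating up: no migration found")
}

// closeUnguarded calls Close without checking the handle. With a nil *sql.DB
// this panics with a nil pointer dereference; recover is used here only so
// the sketch can report the panic instead of crashing.
func closeUnguarded(db *sql.DB) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = db.Close()
	return false
}

// closeGuarded skips Close when the open step never produced a handle.
func closeGuarded(db *sql.DB) error {
	if db == nil {
		return nil
	}
	return db.Close()
}

func main() {
	db, err := openDB()
	fmt.Println("open error:", err)
	fmt.Println("unguarded Close panicked:", closeUnguarded(db))
	fmt.Println("guarded Close error:", closeGuarded(db))
}
```

Under this reading, the panic is a symptom and the logged migration error is the underlying problem, so the error line above the panic is the part worth including in a report.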

tabossert commented 9 months ago

Actually, I just upgraded to 2.1.0, which was just released, and it seems to be loading. I will report back if the crashes stop.

michaelmdresser commented 9 months ago

Thanks for the update and sorry for the confusion about the back-and-forth upgrade. Please let us know if you run into trouble with 2.1.0.

tabossert commented 8 months ago

Issue seems to be resolved, thanks!

wiktor2200 commented 7 months ago

Hello everyone! @michaelmdresser I'm experiencing a similar issue on a GKE cluster with version 2.2.2, so it seems to be back. It was working and then suddenly stopped. Here's the full Go trace:

INF Starting Kubecost Aggregator version kcm-c630c42588_core-c3cb2218df_oc-088f891d8e (c630c425)                                                                                
INF NAMESPACE: kubecost                                                                                                                                                         
ERR error doing initial open of DB: error opening db at path /var/configs/waterfowl/duckdb/v0_9_2/kubecost.duckdb.write: setting up migrations: opening '/var/configs/waterfowl/d
panic: runtime error: invalid memory address or nil pointer dereference                                                                                                                                        
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x17774f5]                                                                                                                                       

goroutine 22 [running]:                                                                                                                                                                                        
database/sql.(*DB).Close(0x0)                                                                                                                                                                                  
    /usr/local/go/src/database/sql/sql.go:910 +0x35                                                                                                                                                            
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.startIngestor(0xc001f93d40, 0xc000afe060)                                                                                                             
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:342 +0x28                                                                                                                                              
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.NewWriter.func5({0x461e2c0?, 0xc0014a0a20?}, 0xc0016d1208?)                                                                                           
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:188 +0x1b                                                                                                                                              
github.com/looplab/fsm.(*FSM).enterStateCallbacks(0xc0014a7c00, {0x63c1568, 0xc003e16190}, 0xc000be00e0)                                                                                                       
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:470 +0x82                                                                                                                                            
github.com/looplab/fsm.(*FSM).Event.(*FSM).Event.func2.func3()                                                                                                                                                 
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:363 +0x150                                                                                                                                           
github.com/looplab/fsm.transitionerStruct.transition(...)                                                                                                                                                      
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:422                                                                                                                                                  
github.com/looplab/fsm.(*FSM).doTransition(...)                                                                                                                                                                
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:407                                                                                                                                                  
github.com/looplab/fsm.(*FSM).Event(0xc0014a7c00, {0x63c10f8, 0x8763380}, {0x4c3e1c7, 0x15}, {0x0, 0x0, 0x0})                                                                                                  
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:390 +0x80a                                                                                                                                           
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.NewWriter.func7({0x461e2c0?, 0xc0014a0a20?}, 0xc000be0070)                                                                                            
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:202 +0x11a                                                                                                                                             
github.com/looplab/fsm.(*FSM).enterStateCallbacks(0xc0014a7c00, {0x63c1568, 0xc003e160a0}, 0xc000be0070)                                                                                                       
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:470 +0x82                                                                                                                                            
github.com/looplab/fsm.(*FSM).Event.(*FSM).Event.func2.func3()                                                                                                                                                 
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:363 +0x150                                                                                                                                           
github.com/looplab/fsm.transitionerStruct.transition(...)                                                                                                                                                      
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:422                                                                                                                                                  
github.com/looplab/fsm.(*FSM).doTransition(...)                                                                                                                                                                
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:407                                                                                                                                                  
github.com/looplab/fsm.(*FSM).Event(0xc0014a7c00, {0x63c10f8, 0x8763380}, {0x4c22c8b, 0xd}, {0x0, 0x0, 0x0})                                                                                                   
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:390 +0x80a                                                                                                                                           
github.com/kubecost/kubecost-cost-model/pkg/duckdb/write.NewWriter(0xc000afe060, {0xc003aa81c0, 0x3a}, {0xc003aa82c0, 0x39})                                                                                   
    /app/kubecost-cost-model/pkg/duckdb/write/writer.go:258 +0x7be                                                                                                                                             
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.createWriter(0xc000afe000)                                                                                                                     
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:400 +0x33                                                                                                                                 
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func7({0x461e2c0?, 0xc003a8c510?}, 0xc000adf420)                                                    
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:213 +0x25                                                                                                      
github.com/looplab/fsm.(*FSM).enterStateCallbacks(0xc003a98d80, {0x63c1568, 0xc0016c8050}, 0xc000adf420)                                                                            
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:470 +0x82                                                                                                                 
github.com/looplab/fsm.(*FSM).Event.(*FSM).Event.func2.func3()                                                                                                                      
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:363 +0x150                                                                                                                
github.com/looplab/fsm.transitionerStruct.transition(...)                                                                                                                           
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:422                                                                                                                       
github.com/looplab/fsm.(*FSM).doTransition(...)                                                                                                                                     
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:407                                                                                                                       
github.com/looplab/fsm.(*FSM).Event(0xc003a98d80, {0x63c10f8, 0x8763380}, {0x4c563b8, 0x1b}, {0x0, 0x0, 0x0})                                                                       
    /root/go/pkg/mod/github.com/looplab/fsm@v1.0.1/fsm.go:390 +0x80a                                                                                                                
github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func6.1()                                                                                           
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:205 +0x3e                                                                                                      
created by github.com/kubecost/kubecost-cost-model/pkg/duckdb/orchestrator.NewOrchestrator.func6 in goroutine 1                                                                     
    /app/kubecost-cost-model/pkg/duckdb/orchestrator/orchestrator.go:204 +0x4e8

wiktor2200 commented 7 months ago

I resolved the issue above by following this message: https://github.com/kubecost/features-bugs/issues/72