derailed / k9s

🐶 Kubernetes CLI To Manage Your Clusters In Style!
https://k9scli.io
Apache License 2.0

Sporadic crashes #2532

Closed ventsislav-georgiev closed 7 months ago

ventsislav-georgiev commented 7 months ago




Describe the bug: I recently started seeing sporadic crashes with the following stack trace:

github.com/derailed/tview@v0.8.3/application.go:723
github.com/derailed/tview.(*Application).QueueUpdateDraw(...)
github.com/derailed/tview@v0.8.3/application.go:730
github.com/derailed/k9s/internal/ui.(*App).QueueUpdateDraw.func1()
github.com/derailed/k9s/internal/ui/app.go:79 +0xc0
created by github.com/derailed/k9s/internal/ui.(*App).QueueUpdateDraw in goroutine 5212
github.com/derailed/k9s/internal/ui/app.go:78 +0x78

goroutine 7417 [chan receive]:
github.com/derailed/k9s/internal/model.(*CmdBuff).Add.func1()
github.com/derailed/k9s/internal/model/cmd_buff.go:165 +0x3c
created by github.com/derailed/k9s/internal/model.(*CmdBuff).Add in goroutine 1
github.com/derailed/k9s/internal/model/cmd_buff.go:164 +0x1cc

goroutine 7530 [chan receive]:
github.com/derailed/tview.(*Application).QueueUpdate(...)
github.com/derailed/tview@v0.8.3/application.go:723
github.com/derailed/tview.(*Application).QueueUpdateDraw(...)
github.com/derailed/tview@v0.8.3/application.go:730
github.com/derailed/k9s/internal/ui.(*App).QueueUpdateDraw.func1()
github.com/derailed/k9s/internal/ui/app.go:79 +0xc0
created by github.com/derailed/k9s/internal/ui.(*App).QueueUpdateDraw in goroutine 5214
github.com/derailed/k9s/internal/ui/app.go:78 +0x7

This is only printed to the terminal and is missing from the k9s log. I'm not sure it will be helpful enough, since it looks like an incomplete crash log.
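For readers unfamiliar with the format: the blocks above are Go goroutine stacks, each starting with `goroutine N [state]:`. A complete dump lists every goroutine, including the one that actually failed, and that is the part missing here. Below is a minimal sketch, not k9s code, of producing such a dump with the standard library's `runtime.Stack`; the helper name `allGoroutineStacks` is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// allGoroutineStacks (hypothetical helper) returns the stacks of every live
// goroutine, formatted like the traces the Go runtime prints on a crash.
func allGoroutineStacks() string {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true = include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		// The buffer was filled, so the dump may be truncated: grow and retry.
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	fmt.Print(allGoroutineStacks())
}
```

The dump starts with the calling goroutine, followed by all others with their wait state (for example `chan receive` or `select`), which is the same layout as the fragments in this report.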

To Reproduce: It just happens. It doesn't seem to be related to a specific resource type, and no user interaction is needed. Just start k9s and let it run for a while, and eventually the error appears.

It may be related to the refresh rate, as it seems to crash more often when I lower the refresh interval to 1s.

Versions (please complete the following information):

ventsislav-georgiev commented 7 months ago

Here is the stack from another crash and a screenshot of how it looks in the terminal:

created by golang.org/x/net/http2.(*ClientConn).RoundTrip in goroutine 212
golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x2d4

goroutine 16095 [select]:
golang.org/x/net/http2.(*clientStream).writeRequest(0x14000fb4600, 0x14000b91800)
golang.org/x/net@v0.19.0/http2/transport.go:1464 +0x900
golang.org/x/net/http2.(*clientStream).doRequest(0x1044428d0?, 0x1400181cfa0?)
golang.org/x/net@v0.19.0/http2/transport.go:1326 +0x20
created by golang.org/x/net/http2.(*ClientConn).RoundTrip in goroutine 356
golang.org/x/net@v0.19.0/http2/transport.go:1232 +0x2d4

goroutine 17059 [sync.Cond.Wait]:
sync.runtime_notifyListWait(0x14000f46948, 0x0)
runtime/sema.go:527 +0x154
sync.(*Cond).Wait(0x14000f46938)
sync/cond.go:70 +0xcc
golang.org/x/net/http2.(*pipe).Read(0x14000f46930, {0x140024c3200, 0x200, 0x200})
golang.org/x/net@v0.19.0/http2/pipe.go:76 +0x108
golang.org/x/net/http2.transportResponseBody.Read({0x14001024538?}, {0x140024c3200?, 0x14001024538?, 0x100f2765c?})
golang.org/x/net@v0.19.0/http2/transport.go:2558 +0x50
encoding/json.(*Decoder).refill(0x140000f9680)
encoding/json/stream.go:165 +0x164
encoding/json.(*Decoder).readValue(0x140000f9680)
encoding/json/stream.go:140 +0x88
encoding/json.(*Decoder).Decode(0x140000f9680, {0x103f4a5e0, 0x14002688870})
encoding/json/stream.go:63 +0x5c
k8s.io/apimachinery/pkg/util/framer.(*jsonFrameReader).Read(0x140026d0f00, {0x14005701400, 0x400, 0x400})
k8s.io/apimachinery@v0.29.1/pkg/util/framer/framer.go:152 +0x19c
k8s.io/apimachinery/pkg/runtime/serializer/streaming.(*decoder).Decode(0x1400077a780, 0x0?, {0x104462b10, 0x1400183ef80})
k8s.io/apimachinery@v0.29.1/pkg/runtime/serializer/streaming/streaming.go:77 +0x88
k8s.io/client-go/rest/watch.(*Decoder).Decode(0x140000f2b40)
k8s.io/client-go@v0.29.1/rest/watch/decoder.go:49 +0x5c
k8s.io/apimachinery/pkg/watch.(*StreamWatcher).receive(0x1400183ef40)
k8s.io/apimachinery@v0.29.1/pkg/watch/streamwatcher.go:105 +0xb0
created by k8s.io/apimachinery/pkg/watch.NewStreamWatcher in goroutine 54
k8s.io/apimachinery@v0.29.1/pkg/watch/streamwatcher.go:76 +0x12
[image: screenshot of the crash as it appears in the terminal]

The stack trace is drawn inside the k9s view instead of being written to the terminal, which is why it is incomplete. This crash happened after 34 minutes.

ventsislav-georgiev commented 7 months ago

Not sure if it is related, but I am running k9s in iTerm2 inside a tmux session.

derailed commented 7 months ago

@ventsislav-georgiev Thank you for reporting this! I have had k9s up and running for days with zero issues, so I suspect something specific to your setup; my best guess is a lock contention issue. There is not enough info in the stacks to find the root cause ;( If you launch k9s and redirect stderr to a file, we should be able to get a full stack and figure out why this is happening, i.e. k9s 2>>/tmp/bozo.log
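The shell redirect above is the simplest route. If a shell redirect is not convenient (for example when k9s is started from another program), the same effect can be sketched in Go with `os/exec`. This is only an illustration under that assumption: `runWithStderrLog` and the usage string are hypothetical names, not part of k9s.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runWithStderrLog (hypothetical helper) runs name with args, leaves stdin
// and stdout attached to the terminal, and appends everything the child
// writes to stderr to logPath.
func runWithStderrLog(logPath, name string, args ...string) error {
	f, err := os.OpenFile(logPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return fmt.Errorf("open log: %w", err)
	}
	defer f.Close()

	cmd := exec.Command(name, args...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = f // same effect as the shell's 2>>logPath
	return cmd.Run()
}

func main() {
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: errlog LOGFILE COMMAND [ARGS...]")
		return
	}
	if err := runWithStderrLog(os.Args[1], os.Args[2], os.Args[3:]...); err != nil {
		fmt.Fprintln(os.Stderr, "command failed:", err)
	}
}
```

Built as a binary named `errlog`, running `errlog /tmp/bozo.log k9s` would behave like the redirect suggested above: the UI still draws on the terminal, while the Go runtime's crash output lands in the log file.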

ventsislav-georgiev commented 7 months ago

Thanks @derailed! That worked. Here is a gist with a complete crash log: https://gist.github.com/ventsislav-georgiev/846c06e91ed955c0deef8e27a08ad1ad

ventsislav-georgiev commented 7 months ago

Here are two more:

derailed commented 7 months ago

@ventsislav-georgiev Thank you so much for the great details!! I couldn't get a repro here at the ranch, but I think I might have found a potential issue (read: shot in the dark!). I'll push a fix in the next drop.