Remote command executions such as exec, attach, cp, etc. handle SIGQUIT (triggered by ctrl + \) properly by exiting after printing a ^\command terminated with exit code 131 error (131 = 128 + 3, SIGQUIT's signal number). On the other hand, as @fira42073 stated, the logs command panics unexpectedly. This looks like a bug to me.
I'm willing to contribute if you could point me to some possible causes. Right now I'm not sure what's happening, because this doesn't really look like a panic either. I see some synchronization primitives in the trace, like sync.(*Cond).Wait, so I assume it may be a concurrency issue.
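For what it's worth, the output looks more like the Go runtime's default reaction to SIGQUIT (a full goroutine dump followed by a non-zero exit) than an actual panic. Here is a minimal, self-contained sketch (not kubectl code, names are made up) showing that registering SIGQUIT with os/signal suppresses that dump and lets the program decide how to exit:

package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// The Go runtime's default reaction to SIGQUIT is a full goroutine dump
	// followed by a non-zero exit; registering the signal with Notify
	// disables that default and delivers it to the channel instead.
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGQUIT)

	go func() {
		s := <-sigCh
		fmt.Fprintf(os.Stderr, "received %v, exiting\n", s)
		os.Exit(128 + int(s.(syscall.Signal))) // 131, matching exec/attach/cp
	}()

	time.Sleep(time.Hour) // stand-in for a long-running follow loop
}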
ChatGPT suggests a similar conclusion:
The trace you shared appears to be from a Go program involving several goroutines, most of which are idle garbage collector (GC) workers. Here's a quick breakdown of some key elements:
Garbage Collector Workers:
- The majority of the goroutines in this trace, like goroutine 35, goroutine 18, and others, are GC background workers. They are in an idle state waiting for the garbage collector to signal work (runtime.gopark is waiting for the GC system to start).
- These goroutines are created by the Go runtime to handle garbage collection, which reclaims memory that is no longer in use.

Network Communication:
- Goroutine 43 seems to be waiting for IO related to network communication. It is using the netpollblock mechanism, which is typical for blocking on network events like reading or writing to a socket.
- Goroutine 59 is part of a select block, commonly used in Go to manage multiple channel operations or asynchronous IO. It could be handling communication or synchronization between goroutines.

TLS (Transport Layer Security):
- Goroutines 43 and 59 also include parts where data is being read from a crypto/tls connection, which is involved in secure communications (likely HTTPS). These involve operations like reading from the network and handling TLS record processing.

HTTP2 Communication:
- Goroutine 59 also handles HTTP2 frame reading (golang.org/x/net/http2.readFrameHeader). This likely indicates that the program is communicating over HTTP2, which is common in modern web services.

Potential Areas to Check:
- Garbage Collection: A large number of idle GC workers may not necessarily be a problem, but you should ensure that garbage collection isn't causing unnecessary overhead, especially in high-load scenarios.
- Network Performance: If you're experiencing slow performance, some of the goroutines seem to be involved in network IO. Checking for blocked or slow connections might help.
- Concurrency: Look at how many goroutines are running and ensure that the system isn't creating more goroutines than needed, as this can lead to excess memory consumption or context switching overhead.
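On that last point about goroutine counts: a generic way to inspect this from inside a Go program (plain standard library, nothing kubectl-specific) is runtime.NumGoroutine together with the goroutine profile, for example:

package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// Number of live goroutines at this moment.
	fmt.Println("goroutines:", runtime.NumGoroutine())

	// Stack dump of every goroutine, in the same format the runtime prints
	// for an unrecovered panic, but produced on demand and without exiting.
	pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
}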
func (o LogsOptions) RunLogs() error {
	var requests map[corev1.ObjectReference]rest.ResponseWrapper
	var err error
	if o.AllPods {
		requests, err = o.AllPodLogsForObject(o.RESTClientGetter, o.Object, o.Options, o.GetPodTimeout, o.AllContainers)
	} else {
		requests, err = o.LogsForObject(o.RESTClientGetter, o.Object, o.Options, o.GetPodTimeout, o.AllContainers)
	}
	if err != nil {
		return err
	}

	// Wrap the log consumption in an interrupt handler so that termination
	// signals are intercepted and the command exits cleanly instead of the
	// runtime printing a goroutine dump.
	intr := interrupt.New(nil, func() {})
	return intr.Run(func() error {
		if o.Follow && len(requests) > 1 {
			if len(requests) > o.MaxFollowConcurrency {
				return fmt.Errorf(
					"you are attempting to follow %d log streams, but maximum allowed concurrency is %d, use --max-log-requests to increase the limit",
					len(requests), o.MaxFollowConcurrency,
				)
			}
			return o.parallelConsumeRequest(requests)
		}
		return o.sequentialConsumeRequest(requests)
	})
}
@fira42073 You can try this.
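For anyone reading along, here is a rough standalone sketch of the pattern the snippet above relies on. runWithInterrupt is a made-up helper, not kubectl's actual interrupt package, but it shows the general idea of racing the work against a termination signal and running cleanup before exiting:

package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// runWithInterrupt runs fn, but if a termination signal arrives first it
// runs cleanup and exits with the conventional 128+signal status instead
// of letting the runtime dump goroutine stacks.
func runWithInterrupt(cleanup func(), fn func() error) error {
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM, syscall.SIGQUIT)
	defer signal.Stop(sigCh)

	errCh := make(chan error, 1)
	go func() { errCh <- fn() }()

	select {
	case err := <-errCh:
		return err
	case s := <-sigCh:
		cleanup()
		fmt.Fprintf(os.Stderr, "interrupted by %v\n", s)
		os.Exit(128 + int(s.(syscall.Signal))) // 131 for SIGQUIT
	}
	return nil
}

func main() {
	err := runWithInterrupt(
		func() { /* e.g. flush output, restore the terminal */ },
		func() error {
			time.Sleep(10 * time.Second) // stand-in for the log-following loop
			return errors.New("stream ended")
		},
	)
	fmt.Println(err)
}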
/triage accepted
@xyz-li your recommendation above is reasonable, and I opened a PR to fix this issue. Thank you.
/priority backlog
What happened: After running kubectl logs -f deploy/gitlab-webservice-default and then pressing ctrl + \, I get the stacktrace and it crashes (see the anything else we need to know section).
What you expected to happen:
I'm not sure what the correct behaviour is, but it received the SIGQUIT signal, so it should probably just quit cleanly?
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
stacktrace?
Environment:
- Kubernetes version (use kubectl version): gcp cluster, kind setup. Tried this both in a working gcp cluster and in kind; same issue.
- OS (e.g: cat /etc/os-release): Package