solo-io / kubesquash

A debugger for Kubernetes applications.

Can't connect to cri socket #12

Open rohitp93 opened 5 years ago

rohitp93 commented 5 years ago

I am just trying to connect to a pod in my cluster, and I get this error after selecting a pod:

error: cannot attach a container in a completed pod; current phase is Failed
Pod errored with:
Logs:
ERROR: logging before flag.Parse: I1213 08:24:13.411387 11627 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W1213 08:24:13.412053 11627 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: I1213 08:24:13.416130 11627 remote_runtime.go:43] Connecting to runtime service unix:///var/run/cri.sock
INFO[0000] found some pids potentialpids="[17951 17978 17980 18122 18124 18126 18129 18132 18134 18137 18139 18141 18143 18144 18145 18146 18147 18148 18149 18150 18151 18152 18153 18155 18156 18157 18158 18159 18160 18162 18163 18164 18168 18169 18170 18171 18172 18173]"
INFO[0000] attaching with dlv pid=17951
could not attach to pid 17951: can't open separate debug file: open /usr/lib/debug/.build-id/04/eca96c5bf3e9a300952a29ef3218f00487d37b.debug: no such file or directory
exit status 1

posix4e commented 5 years ago

I got basically the same thing

posix4e commented 5 years ago

ERROR: logging before flag.Parse: I1218 03:15:54.589947   13806 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W1218 03:15:54.590181   13806 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: I1218 03:15:54.602834   13806 remote_runtime.go:43] Connecting to runtime service unix:///var/run/cri.sock
WARN[0000] Invalid number of pods                        items="([]*v1alpha2.PodSandbox) <nil>\n"
FATA[0000] debug failed!                                 err="Invalid number of pods"
Pod errored with: <nil>
 Logs:
 ERROR: logging before flag.Parse: I1218 03:15:54.589947   13806 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W1218 03:15:54.590181   13806 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: I1218 03:15:54.602834   13806 remote_runtime.go:43] Connecting to runtime service unix:///var/run/cri.sock
WARN[0000] Invalid number of pods                        items="([]*v1alpha2.PodSandbox) <nil>\n"
FATA[0000] debug failed!                                 err="Invalid number of pods"
pod is not running and not pending

posix4e commented 5 years ago

I am on Azure.

yuval-k commented 5 years ago

What version of Kubernetes are you using? Do you know if the location of your CRI socket has changed? Can you provide the kubelet command line?
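
For anyone gathering the information asked for above, here is a minimal sketch of pulling the CRI endpoint out of a kubelet command line. It assumes kubelet was started with the `--container-runtime-endpoint` flag. The helper name `runtime_endpoint` is made up for illustration and is not part of kubesquash.

```python
import shlex

# kubelet flag that names the CRI socket; if it is absent, kubelet is
# using its built-in default and this sketch returns None.
FLAG = "--container-runtime-endpoint"


def runtime_endpoint(cmdline):
    """Return the value of --container-runtime-endpoint, or None if absent.

    Handles both the "--flag=value" and the "--flag value" spellings.
    """
    args = shlex.split(cmdline)
    for i, arg in enumerate(args):
        if arg == FLAG and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith(FLAG + "="):
            return arg.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Example command line; paste the real one from the node instead.
    example = "kubelet --container-runtime-endpoint=unix:///var/run/dockershim.sock --v=2"
    print(runtime_endpoint(example))
```

The command line itself can usually be read on the node with something like `ps -o args= -C kubelet`. When kubelet runs in a container, the path it reports is inside that container's filesystem and may not exist on the host.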

mazzy89 commented 5 years ago

Same here. Kubernetes version 1.13.1.

This is the result:

$ ls /var/run/*sock
/var/run/docker.sock
$ ksquash -crisock /var/run/docker.sock
ERROR: logging before flag.Parse: I0123 08:59:31.923747   31555 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W0123 08:59:31.923857   31555 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: E0123 08:59:31.924413   31555 remote_runtime.go:434] Status from runtime service failed: rpc error: code = Unavailable desc = transport is closing
{"level":"fatal","ts":1548233971.9254968,"caller":"kubesquash-container/main.go:20","msg":"debug failed!","error":"rpc error: code = Unavailable desc = transport is closing","stacktrace":"main.main\n\t/workspace/cmd/kubesquash-container/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
Pod errored with: <nil>
 Logs:
 ERROR: logging before flag.Parse: I0123 08:59:31.923747   31555 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W0123 08:59:31.923857   31555 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: E0123 08:59:31.924413   31555 remote_runtime.go:434] Status from runtime service failed: rpc error: code = Unavailable desc = transport is closing
{"level":"fatal","ts":1548233971.9254968,"caller":"kubesquash-container/main.go:20","msg":"debug failed!","error":"rpc error: code = Unavailable desc = transport is closing","stacktrace":"main.main\n\t/workspace/cmd/kubesquash-container/main.go:20\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:201"}
pod is not running and not pending

mazzy89 commented 5 years ago

In my case, Pod Security Policy is enabled in the cluster and the default policy doesn't allow mounting hostPath volumes. Since the debug pod is created without any way to specify a ServiceAccount, I can't use kubesquash. I'm not sure whether the error message is directly related to this.

yuval-k commented 5 years ago

@mazzy89 thanks for the info. I can add a service account option to the kubesquash flags, which should help with your issue.
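
To illustrate what a service account option would change, here is a rough sketch of a debug pod manifest built as a Python dict. The Kubernetes field names (`serviceAccountName`, `hostPID`, `nodeName`, `hostPath`) are real, but everything else is an assumption for illustration: the function name, the image, and the exact shape of the pod that kubesquash creates are not taken from its source.

```python
def debug_pod_manifest(service_account, node, cri_socket="/var/run/cri.sock"):
    """Build a pod manifest that mounts the node's CRI socket.

    Assumptions (not confirmed by the kubesquash source): the pod shares the
    host PID namespace so the debugger can see target processes, and the CRI
    socket is mounted at /var/run/cri.sock inside the container.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"generateName": "kubesquash-container"},
        "spec": {
            # A service account bound to a policy that permits hostPath.
            "serviceAccountName": service_account,
            # Schedule on the node that runs the pod being debugged.
            "nodeName": node,
            "hostPID": True,
            "restartPolicy": "Never",
            "containers": [{
                "name": "kubesquash-container",
                "image": "example.invalid/debugger:latest",  # hypothetical image
                "volumeMounts": [
                    {"name": "crisock", "mountPath": "/var/run/cri.sock"},
                ],
            }],
            "volumes": [
                {"name": "crisock", "hostPath": {"path": cri_socket}},
            ],
        },
    }


if __name__ == "__main__":
    import json
    print(json.dumps(debug_pod_manifest("debugger-sa", "node-1"), indent=2))
```

With Pod Security Policy enabled, the pod is admitted only if the creating user or the pod's service account is authorized to use a policy that allows hostPath volumes, which is why being able to set `serviceAccountName` matters here.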

We have also seen another issue related to kubelet running in a container. Still figuring this one out.

mazzy89 commented 5 years ago

Yeah thank you

mazzy89 commented 5 years ago

In my case, since kubelet runs as a container, it was necessary to mount the dockershim socket on the host machine. Once that was done, it connected.

yuval-k commented 5 years ago

To summarize, the issue was that kubelet was running inside a container, and therefore dockershim was not available outside the container for kubesquash to connect to. The solution was to change the kubelet configuration so that it creates the dockershim socket in a volume shared with the host.

I'm not sure what we can do better in this case, except maybe support docker.sock along with dockershim.sock, since docker.sock is present on the host. docker.sock speaks a different API, so that's a non-trivial item. Other ideas are also welcome. I will keep this open for tracking.
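
As a quick way to see which runtime sockets are actually visible on a node, something like the sketch below could be run on the host. The candidate list is an assumption: it combines the paths mentioned in this thread with the usual default locations for dockershim, CRI-O, and containerd, and a given cluster may use different paths.

```python
import os
import stat

# Paths from this thread plus common runtime defaults (an assumption, not
# an exhaustive list). docker.sock speaks the Docker Engine API, not CRI,
# so finding it does not mean a CRI client can use it.
CANDIDATES = (
    "/var/run/cri.sock",
    "/var/run/dockershim.sock",
    "/var/run/crio/crio.sock",
    "/run/containerd/containerd.sock",
    "/var/run/docker.sock",
)


def existing_sockets(paths=CANDIDATES):
    """Return the subset of paths that exist and are Unix domain sockets."""
    found = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Missing path or no permission to stat it: treat as not found.
            continue
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in existing_sockets():
        print(path)
```

If only /var/run/docker.sock shows up, as in the `ls` output earlier in this thread, the CRI socket is probably being created somewhere the host cannot see, for example inside the kubelet container.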

rvansa commented 5 years ago

I have (probably) the same issue, running on RHEL with OpenShift 3.10 (Kubernetes v1.10.0). I had to remove the multi-stage build from Dockerfile.dlv since Docker 1.13.1 doesn't support multi-stage builds.

Anyway, when I select the pod I get:

? Select a namespace istio-system
? Select a pod istio-pilot-749777f9d8-r7v8x
? Select a container discovery
? Going to attach dlv to pod istio-pilot-749777f9d8-r7v8x. continue? Yes
Can't get logs from errored pod
pod is not running and not pending: container "kubesquash-container" in pod "kubesquash-container9c985" is not available

Running docker run -it docker.io/rvansa/kubesquash-container-dlv:v0.1.10 directly gives me:

ERROR: logging before flag.Parse: I0131 17:28:29.701350       1 remote_runtime.go:43] Connecting to runtime service /var/run/cri.sock
ERROR: logging before flag.Parse: W0131 17:28:29.701954       1 util_unix.go:75] Using "/var/run/cri.sock" as endpoint is deprecated, please consider using full url format "unix:///var/run/cri.sock".
ERROR: logging before flag.Parse: E0131 17:28:29.702885       1 remote_runtime.go:434] Status from runtime service failed: rpc error: code = Unavailable desc = grpc: the connection is unavailable
{"level":"fatal","ts":1548955709.7030885,"caller":"kubesquash-container/main.go:20","msg":"debug failed!","error":"rpc error: code = Unavailable desc = grpc: the connection is unavailable","stacktrace":"main.main\n\t/home/benchuser/rvansa/kubesquash-0.1.10/cmd/kubesquash-container/main.go:20\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:201"}

(Note that I don't have /var/run/cri.sock. Is kubesquash intended only for CRI-O based containers?)
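
To tell a missing socket apart from one that exists but has nothing listening, a small probe along these lines may help. It is only a sketch: `probe_unix_socket` is a made-up helper, and it checks that a connection can be opened, not that the other end actually speaks the CRI gRPC API.

```python
import errno
import socket


def probe_unix_socket(path, timeout=1.0):
    """Try to connect to a Unix socket and describe the outcome.

    Returns "ok", "missing", "refused", or "error: <detail>".
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return "ok"
    except FileNotFoundError:
        # No file at that path, e.g. the socket was never mounted.
        return "missing"
    except ConnectionRefusedError:
        # A socket file exists but nothing is accepting connections.
        return "refused"
    except OSError as exc:
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        s.close()


if __name__ == "__main__":
    print(probe_unix_socket("/var/run/cri.sock"))
```

Note that a plain `docker run -it <image>` does not mount any host socket into the container, so a "missing" result would be expected there unless the socket is passed in with a `-v` volume mount.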