asmirnou / watsor-helm-chart

Helm chart to deploy Watsor on Kubernetes
MIT License

k3s crashloop #1

Closed shmrymbd closed 3 years ago

shmrymbd commented 3 years ago

Hi Asmirnou, the pod fails to run and ends up in CrashLoopBackOff on my Jetson Nano k3s cluster. I installed the chart with Helm using the default configuration. Can you help?

asmirnou commented 3 years ago

Use kubectl to find out why it crashes:

kubectl -n <namespace-name> describe pod <pod-name>
kubectl -n <namespace-name> logs <pod-name> -f
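
For a pod that keeps restarting, the current container has often not logged anything yet, so the logs of the previous (crashed) instance tend to be more useful. A minimal sketch that wraps the two lookups; the function name `crashloop_report` is made up for illustration, and it assumes the pod has a single container:

```shell
# Print the details most useful for a pod stuck in CrashLoopBackOff.
# $1 = namespace, $2 = pod name
# Assumes a single-container pod (index 0 in containerStatuses).
crashloop_report() {
  ns=$1
  pod=$2
  # Reason and exit code of the last terminated instance of the container.
  kubectl -n "$ns" get pod "$pod" -o \
    jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{" exit="}{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'
  # Logs of the previous (crashed) instance instead of the current one.
  kubectl -n "$ns" logs "$pod" --previous
}

# Example: crashloop_report <namespace-name> <pod-name>
```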

I haven't tested k3s, but it works fine in my minikube.

Also, I don't think any version of Kubernetes can utilise the full potential of the Jetson Nano, where the infrastructure for object detection is provided out of the box. A container simply can't reach the Jetson's GPU. The best way to install Watsor on Jetson Nano is using the following guide.

shmrymbd commented 3 years ago

These are my findings:


root@ddd-desktop:~# kubectl -n default describe pod watsor-b65779566-hzbqt
Name:         watsor-b65779566-hzbqt
Namespace:    default
Priority:     0
Node:         ddd-desktop/192.168.1.12
Start Time:   Wed, 11 Nov 2020 01:02:27 +0800
Labels:       app.kubernetes.io/instance=watsor
              app.kubernetes.io/name=watsor
              pod-template-hash=b65779566
Annotations:  <none>
Status:       Running
IP:           10.42.2.7
IPs:
  IP:           10.42.2.7
Controlled By:  ReplicaSet/watsor-b65779566
Containers:
  watsor:
    Container ID:   containerd://7377756936b8e112b94ee16fd2f9e4ec15fb9640deb92b26246d2aa855c4dc46
    Image:          smirnou/watsor:1.0.4
    Image ID:       docker.io/smirnou/watsor@sha256:7584d79a10868860df9a45fe13ae4f1a701df47ce42d09b9050b0dc46bdf1cce
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 11 Nov 2020 02:04:36 +0800
      Finished:     Wed, 11 Nov 2020 02:04:36 +0800
    Ready:          False
    Restart Count:  17
    Liveness:       http-get http://:http/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /dev/shm from dshm (rw)
      /etc/watsor/config.yaml from config (ro,path="config.yaml")
      /var/run/secrets/kubernetes.io/serviceaccount from watsor-token-2rlfp (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      watsor
    Optional:  false
  dshm:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  512Mi
  watsor-token-2rlfp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  watsor-token-2rlfp
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  BackOff  2m2s (x309 over 67m)  kubelet  Back-off restarting failed container

and

root@ddd-desktop:~# kubectl -n default logs  watsor-b65779566-hzbqt -f
standard_init_linux.go:219: exec user process caused: exec format error

            
shmrymbd commented 3 years ago

Hi

Also, I don't think any version of Kubernetes can utilise the full potential of the Jetson Nano, where the infrastructure for object detection is provided out of the box. A container simply can't reach the Jetson's GPU. The best way to install Watsor on Jetson Nano is using the following guide.

I followed the instructions from https://github.com/opendatacam/opendatacam to install it on a k3s cluster of 3 Jetson Nano units, and it works like a charm on the Jetson GPU (monitored using jtop).

Will try again :-)

asmirnou commented 3 years ago

standard_init_linux.go:219: exec user process caused: exec format error

Right, this happens because the Jetson Nano has an ARM architecture, while the default container image was built for the x86-64 architecture.
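
A quick way to confirm this kind of mismatch is to compare the node's machine name with the architecture recorded in the image. A rough sketch, assuming Docker is available on the node and the image has been pulled locally; the helper names `machine_to_image_arch` and `arch_matches` are hypothetical:

```shell
# Map a kernel machine name (as printed by `uname -m`) to the
# architecture name used in container image metadata.
machine_to_image_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    armv7*)        echo "arm" ;;
    *)             echo "$1" ;;   # unknown names are passed through unchanged
  esac
}

# Succeeds (exit status 0) when an image built for architecture $2
# can run natively on a node whose `uname -m` output is $1.
arch_matches() {
  [ "$(machine_to_image_arch "$1")" = "$2" ]
}

# Usage on the node:
#   image_arch=$(docker image inspect --format '{{.Architecture}}' smirnou/watsor:1.0.4)
#   arch_matches "$(uname -m)" "$image_arch" || echo "architecture mismatch"
```

Kubernetes also reports the node side on its own: `kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.architecture}'`.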

At the moment there is no Docker image for Jetson that you could specify as a Helm chart parameter, so it is better to install Watsor as a Python module, as recommended above.
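
For reference, this is roughly how an alternative image would be passed to a chart once one exists. It is only a sketch: the value names `image.repository` and `image.tag` are an assumption based on the common Helm chart layout, not confirmed in this thread, and `image_override_flags` is a hypothetical helper.

```shell
# Build Helm --set flags that point a chart at another image.
# ASSUMPTION: the chart exposes image.repository and image.tag values;
# verify the real names with `helm show values <chart-reference>` first.
# $1 = image repository, $2 = image tag
image_override_flags() {
  printf '%s %s %s %s\n' --set "image.repository=$1" --set "image.tag=$2"
}

# Example (placeholders left unfilled on purpose):
#   helm upgrade --install watsor <chart-reference> \
#     $(image_override_flags <repository> <tag>)
```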

Thanks for letting me know about opendatacam. I'll take a look and maybe create a dedicated Docker image later.

asmirnou commented 3 years ago

FYI, if it's still relevant, I built a new Docker image for Jetson devices (Xavier, TX2, and Nano). The image is based on L4T and can be run using the NVIDIA Container Toolkit. The platform-specific libraries and drivers are mounted into the container by the NVIDIA container runtime from the underlying Jetson device.

Tested on Jetson Nano in Docker. I assume it should also run on k3s using the NVIDIA k8s plugin, but honestly I haven't tried.
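
A sketch of what running it in Docker could look like. The port, shared memory size and config location mirror the pod description earlier in this thread; the image reference is left as a parameter because it is not named here, and `run_watsor_jetson` is a hypothetical wrapper. It assumes the NVIDIA container runtime is registered with Docker under the name `nvidia`.

```shell
# Start Watsor from a Jetson image under the NVIDIA container runtime.
# $1 = image reference (placeholder, not named in this thread)
# $2 = host directory that contains config.yaml
run_watsor_jetson() {
  # --runtime nvidia: lets the runtime mount the Jetson libraries and drivers.
  # 8080, 512m and /etc/watsor match the port, /dev/shm size and config path
  # shown in the pod description above.
  docker run --rm --runtime nvidia \
    --publish 8080:8080 \
    --shm-size=512m \
    --volume "$2":/etc/watsor:ro \
    "$1"
}

# Example: run_watsor_jetson <jetson-image> "$PWD/config"
```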