alcounit / selenosis

Scalable, stateless selenium hub for Kubernetes cluster
Apache License 2.0

Browser pods get stuck #31

Closed shlomitsur closed 3 years ago

shlomitsur commented 3 years ago

Hello, after some time my Chrome browser pods get stuck. I run with browser-limit=1000 and 10 replicas. This is my service's /status output:

{"status":200,"version":"v1.0.2","selenosis":{"total":1000,"active":1487,"pending":953,"config":{"chrome":["89.0"]},"sessions":[{"id":"selenoid-chrome-89-0-085161ae-79c8-4cc9-8a33-13b8df55a24a","labels":{"browserName":"chrome","browserVersion":"89.0","testName":""},"started":"2021-04-05T06:14:07Z","uptime":"1716.12s"},{"id":"selenoid-chrome-89-0-5e4ed75b-4943-42d6-a5c0-c7baabf189fe","labels":{"browserName":"chrome","browserVersion":"89.0","testName":""},"started":"2021-04-05T05:16:48Z","uptime":"5155.12s"},{"id":"selenoid-chrome-89-0-a1b408e4-e966-4a44-aa4c-762b66bfa5de","labels":{"browserName":"chrome","browserVersion":"89.0","testName":""},"started":"2021-04-05T05:14:38Z","uptime":"5285.12s"},{"id":"selenoid-chrome-89-0-d7edddb2-f1a6-4e78-a003-36f88aac45f2","labels":{"browserName":"chrome","browserVersion":"89.0","testName":""},"started":"2021-04-05T04:27:53Z","uptime":"8090.12s"},{"id":"selenoid-chrome-89-0-5f4ae19e-d06f-4522-ad8e-9be3a2eee0e5","labels":{"browserName":"chrome","browserVersion":"89.0","testName":""},"started":"2021-04-04T21:14:20Z","uptime":"34103.12s"},{"id":"selenoid-chrome-89-0-72caed14-e136-49cb-a811-ef79013f0a60","labels":
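
For anyone triaging a similar report, here is a rough Python sketch of how a /status payload like the one above could be checked for stuck sessions. The field names ("total", "active", "pending", "sessions", "uptime") are taken from the output in this report. The function name `summarize_status`, the one-hour threshold, and the reading of "total" as the configured browser limit are assumptions made for illustration, not documented selenosis behavior. The sample is a shortened, well-formed payload built from the first two sessions quoted above.

```python
import json

# Shortened sample built from the first two sessions in the report above.
SAMPLE = json.dumps({
    "status": 200,
    "version": "v1.0.2",
    "selenosis": {
        "total": 1000,
        "active": 1487,
        "pending": 953,
        "config": {"chrome": ["89.0"]},
        "sessions": [
            {
                "id": "selenoid-chrome-89-0-085161ae-79c8-4cc9-8a33-13b8df55a24a",
                "started": "2021-04-05T06:14:07Z",
                "uptime": "1716.12s",
            },
            {
                "id": "selenoid-chrome-89-0-5e4ed75b-4943-42d6-a5c0-c7baabf189fe",
                "started": "2021-04-05T05:16:48Z",
                "uptime": "5155.12s",
            },
        ],
    },
})


def summarize_status(payload: str, max_uptime_s: float = 3600.0) -> dict:
    """Flag suspicious values in a /status payload.

    max_uptime_s is an arbitrary threshold chosen for this sketch.
    """
    info = json.loads(payload)["selenosis"]
    # Uptime values in the report look like "1716.12s"; strip the unit suffix.
    stale = [
        session["id"]
        for session in info.get("sessions", [])
        if float(session["uptime"].rstrip("s")) > max_uptime_s
    ]
    return {
        # Assumption: "total" reflects the browser limit, so active > total
        # suggests sessions that were never cleaned up.
        "over_limit": info["active"] > info["total"],
        "active": info["active"],
        "pending": info["pending"],
        "stale_sessions": stale,
    }


if __name__ == "__main__":
    print(summarize_status(SAMPLE))
```

With the sample above, the first session is under the one-hour threshold and the second is over it, and the active count exceeds the total, which matches the symptom described here.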

These are the events of one of the unterminated chrome pods:

Events:
  Type     Reason                  Age                  From     Message
  ----     ------                  ----                 ----     -------
  Warning  FailedCreatePodSandBox  54m (x4 over 97m)    kubelet  Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "selenoid-chrome-89-0-fe06cf7c-6dc2-454b-85b5-ccf2b8488925": operation timeout: context deadline exceeded
  Normal   SandboxChanged          51m (x8 over 91m)    kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulled                  47m                  kubelet  Container image "selenoid/chrome:89.0" already present on machine
  Normal   Created                 46m                  kubelet  Created container browser
  Normal   Started                 45m                  kubelet  Started container browser
  Warning  FailedSync              29m (x2 over 29m)    kubelet  error determining status: rpc error: code = Unknown desc = Error: No such container: 2421f6056f9b5ec64e2f537b4dad482eaae8da6526a4eb4b237ef711e5b4cf34
  Normal   Pulled                  21m (x4 over 45m)    kubelet  Container image "alcounit/seleniferous:v0.0.3-develop" already present on machine
  Warning  Failed                  19m (x4 over 43m)    kubelet  Error: context deadline exceeded
  Warning  FailedSync              16m (x17 over 19m)   kubelet  error determining status: rpc error: code = Unknown desc = Error: No such container: 4a568fe0d325d17556087fb64b4f1a49da0b43a83bc24c89b2ae725ffe22bf4b
  Warning  FailedKillPod           10m                  kubelet  error killing pod: [failed to "KillContainer" for "browser" with KillContainerError: "rpc error: code = Unknown desc = operation timeout: context deadline exceeded", failed to "KillPodSandbox" for "3d753a9c-02c3-45a6-96be-23f7096bd393" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
  Warning  FailedKillPod           44s (x5 over 8m54s)  kubelet  error killing pod: failed to "KillContainer" for "browser" with KillContainerError: "rpc error: code = Unknown desc = operation timeout: context deadline exceeded"

alcounit commented 3 years ago

Hi @shlomitsur, you closed the issue. Is it solved?

shlomitsur commented 3 years ago

Yes, thanks @alcounit. It might be because of something I did to the cluster; I'm not sure.