reanahub / reana-job-controller

REANA Job Controller
http://reana-job-controller.readthedocs.io/
MIT License

k8s: pending pod status retrieval failure #268

Closed diegodelemos closed 4 years ago

diegodelemos commented 4 years ago

To determine why a pod is in the Pending state, we read its containerStatuses. However, this field is apparently nullable, which breaks our code. Seen on reana-qa.cern.ch:

$ kubectl logs reana-run-batch-xxxx-yyy job-controller
2020-08-07 15:39:15,527 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,529 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | ERROR | Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/reana_job_controller/job_monitor.py", line 256, in watch_jobs
    job_status = self.get_job_status(job_pod)
  File "/usr/local/lib/python3.6/site-packages/reana_job_controller/job_monitor.py", line 175, in get_job_status
    job_pod.status.init_container_statuses or []
TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'

2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | ERROR | Unexpected error: unsupported operand type(s) for +: 'NoneType' and 'list'
2020-08-07 15:39:15,568 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,571 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,574 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
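
A minimal sketch of one possible guard, not necessarily the fix that was merged. It assumes the failing expression concatenates container_statuses with init_container_statuses (the traceback only shows the right-hand operand, so the left-hand one is a guess), and it uses SimpleNamespace as a stand-in for the Kubernetes pod object. The helper name collect_container_statuses is hypothetical.

```python
from types import SimpleNamespace


def collect_container_statuses(pod):
    """Return regular and init container statuses as one list.

    Either field may be None for a pod that has not been scheduled yet,
    so each is replaced by an empty list before concatenating.
    """
    status = pod.status
    return (status.container_statuses or []) + (
        status.init_container_statuses or []
    )


# Stand-in for a freshly ADDED pod whose status fields are still unset
# (assumption: this mirrors what the watcher received on reana-qa).
pending_pod = SimpleNamespace(
    status=SimpleNamespace(
        container_statuses=None,
        init_container_statuses=None,
    )
)

statuses = collect_container_statuses(pending_pod)
```

With both operands defaulted to an empty list, a Pending pod without any reported statuses yields an empty list instead of raising TypeError, so the monitor can simply skip it until a later event carries more detail.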

tiborsimko commented 4 years ago

Merged to maint-0.7