To find out why a pod is in the Pending state we read its `containerStatuses`. However, this field is apparently nullable, which breaks our code. Seen on reana-qa.cern.ch:
$ kubectl logs reana-run-batch-xxxx-yyy job-controller
2020-08-07 15:39:15,527 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,529 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | ERROR | Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/reana_job_controller/job_monitor.py", line 256, in watch_jobs
job_status = self.get_job_status(job_pod)
File "/usr/local/lib/python3.6/site-packages/reana_job_controller/job_monitor.py", line 175, in get_job_status
job_pod.status.init_container_statuses or []
TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'
2020-08-07 15:39:15,531 | root | kubernetes_job_monitor | ERROR | Unexpected error: unsupported operand type(s) for +: 'NoneType' and 'list'
2020-08-07 15:39:15,568 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,571 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
2020-08-07 15:39:15,574 | root | kubernetes_job_monitor | INFO | New Pod event received: ADDED
[ ] 1. Figure out how to tell whether a field is nullable (the docs do not mention it, or at least I could not find it).
[ ] 2. Once 1. is done, review all the code and check whether we are treating other nullable fields as non-nullable.
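As a possible starting point for the fix, here is a minimal sketch of a None-safe way to combine the two status lists. The traceback suggests that the left-hand operand of the `+` (presumably `job_pod.status.container_statuses`) was `None`. The helper name `collect_container_statuses` is hypothetical and not part of reana-job-controller, and a `SimpleNamespace` stands in for the real pod status object so the snippet runs without a cluster.

```python
from types import SimpleNamespace


def collect_container_statuses(pod_status):
    """Return regular and init container statuses as a single list.

    Either attribute may be None (seen here on freshly ADDED pods),
    so each one is replaced by an empty list before concatenating.
    """
    regular = getattr(pod_status, "container_statuses", None) or []
    init = getattr(pod_status, "init_container_statuses", None) or []
    return regular + init


# Stand-in for job_pod.status on a pod where both fields are unset.
pending_status = SimpleNamespace(
    container_statuses=None, init_container_statuses=None
)
print(collect_container_statuses(pending_status))  # prints []
```

Guarding both operands, rather than only `init_container_statuses`, keeps the concatenation safe whichever field happens to be missing.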