Open mfeit-internet2 opened 3 years ago
It seems to be a problem with the Docker version. With the latest version, 20.10.8, it doesn't work, but with versions 20.10.0 and 19.03.9 the tasks run correctly, even though the pscheduler processes keep restarting all the time. I still haven't figured out what changed between versions to cause this.
Here is a snippet of the container log with Docker 19.03.9:
2021-09-28 18:56:42,952 INFO spawned: 'pscheduler-scheduler' with pid 1430
2021-09-28 18:56:42,952 INFO success: pscheduler-runner entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-09-28 18:56:43,600 INFO exited: pscheduler-archiver (exit status 1; not expected)
2021-09-28 18:56:43,768 INFO spawned: 'pscheduler-archiver' with pid 1432
2021-09-28 18:56:44,300 INFO success: pscheduler-scheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-09-28 18:56:44,344 INFO exited: pscheduler-ticker (exit status 1; not expected)
2021-09-28 18:56:44,900 INFO spawned: 'pscheduler-ticker' with pid 1435
2021-09-28 18:56:44,900 INFO success: pscheduler-archiver entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-09-28 18:56:44,900 INFO exited: pscheduler-runner (exit status 1; not expected)
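To see which pscheduler programs are restarting and how often, the supervisord log can be tallied with a few lines of Python. This is only a rough sketch: it assumes the log lines keep the format shown in the snippet above, and the function name `count_unexpected_exits` and the `SAMPLE` text are made up for illustration (the sample just reuses three lines from the log above).

```python
import re
from collections import Counter

# Matches supervisord lines such as:
#   2021-09-28 18:56:43,600 INFO exited: pscheduler-archiver (exit status 1; not expected)
# Assumption: the log format is the one shown in the snippet in this thread.
EXIT_RE = re.compile(
    r"INFO exited: (?P<name>\S+) \(exit status (?P<status>\d+); not expected\)"
)


def count_unexpected_exits(log_text):
    """Return a Counter mapping program name to its number of unexpected exits."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


# Illustrative input: lines copied from the container log quoted above.
SAMPLE = """\
2021-09-28 18:56:43,600 INFO exited: pscheduler-archiver (exit status 1; not expected)
2021-09-28 18:56:43,768 INFO spawned: 'pscheduler-archiver' with pid 1432
2021-09-28 18:56:44,344 INFO exited: pscheduler-ticker (exit status 1; not expected)
2021-09-28 18:56:44,900 INFO exited: pscheduler-runner (exit status 1; not expected)
"""

if __name__ == "__main__":
    for name, n in count_unexpected_exits(SAMPLE).most_common():
        print(f"{name}: {n}")
```

Running it against a longer capture of the container log (for example the output of `docker logs` saved to a file) should show whether every pscheduler daemon is cycling or only some of them.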
Updating supervisor resolved this issue on my images using supervisord and Docker 20.10.1+.
It must be a version released after this merge to the API that supervisor uses:
https://github.com/docker/docker-py/commit/1757c974fa3a05b0e9b783af85242b18df09d05d
You may have to work around the yum repo and install it with python3 pip in the Dockerfile.
Experienced exactly the same issue. Using the systemd based image now.
Fixes for this have been applied to the 5.0.0 branch. This lets supervisord manage the processes instead of using the --daemon options. Look at /etc/supervisord.conf for the changes.
Ignacio Peluaga Lozada writes:
Internet2 saw this as well. The runner service fails to start.