perfsonar / perfsonar-testpoint-docker

Apache License 2.0
12 stars 15 forks source link

Let supervisord manage daemons for bugfix #26

Closed MiddelkoopT closed 2 years ago

MiddelkoopT commented 2 years ago

Fix for crashing runner. Container was unusable without this fix.

Simple jobs like pscheduler task idle --duration PT2S would not run and fail with

Run has not completed.

No jobs would run unless either a "print" or log statement was put at the top of /usr/libexec/pscheduler/daemons/runnerinmain_program()` due to crashing runner. This fix seems like it is the proper approach to running daemons with supervisord. This smells like a race condition and this version still does not capture output from the runner daemon but "fixes" the problems.

Without this fix, and running the runner in a separate shell, it will get the following error.

Traceback (most recent call last):
  File "/usr/libexec/pscheduler/daemons/runner", line 1070, in <module>
    main_program()
  File "/usr/libexec/pscheduler/daemons/runner", line 949, in main_program
    db.notifies.pop(0)
IndexError: pop from empty list
mfeit-internet2 commented 2 years ago

This fix seems like it is the proper approach to running daemons with supervisord. This smells like a race condition and this version still does not capture output from the runner daemon but "fixes" the problems.

The --daemon switch in the services is a holdover from the SysV init days that we no longer use with systemd. I've left it in there in case we ever decide to field it on a system that doesn't have systemd. Letting supervisord deal with it as a foreground process rather than daemonized makes more sense.

I've never seen IndexError situation on bare metal or VMs, but the call to pop() does make an invalid assumption about the length of the notification list. That's been fixed and is being tested as we sepak.

mfeit-internet2 commented 2 years ago

Ugh. Just noticed that was merged into master when it should have been merged into 5.0.0. I'll take care of that.

MiddelkoopT commented 2 years ago

For future reference - which branch should be targeted?

mfeit-internet2 commented 2 years ago

Usually it's the branch with the highest version number, which is what's in line for the next release. This repo hasn't been properly branched for development. That'll get fixed, too.

What's in master reflects what we consider to be the production version.

How we do things:

MiddelkoopT commented 2 years ago

Great - let me know if you want me to create a new PR if it's easier that way.