bstansell / conserver

Logged, multi-user access to device consoles
https://www.conserver.com/
BSD 3-Clause "New" or "Revised" License

Unexpected behavior when exec() commands for console error out immediately #102

Open MitchellAugustin opened 6 months ago

MitchellAugustin commented 6 months ago

In our lab, we have ~30 consoles defined in our conserver.cf. We recently observed that when some of those consoles' exec() calls error out as conserver tries to bring them up, many of the other consoles will not be brought up either.

To replicate this behavior, I created a dummy conserver.cf using cat in place of our successful consoles and (exit 1 || sleep 10) in place of the failing ones. Consistently, after restarting conserver, only 10 of the consoles are brought up.
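A dummy config along these lines could be generated with a short script. This is only a sketch: the console names (good1, fail1, ...), the counts, and the use of `master localhost` are assumptions for illustration, not the exact file used in the report, and a real conserver.cf would likely also need `default` and `access` blocks.

```shell
#!/bin/sh
# Hypothetical counts: the report does not say how many consoles of each kind were used.
good_count=3
fail_count=2
cf=$(mktemp)

# Append one console block to the config.
# $1 = console name (hypothetical), $2 = command for the exec option
emit_console() {
    printf 'console %s {\n    master localhost;\n    type exec;\n    exec %s;\n}\n' "$1" "$2" >> "$cf"
}

# Consoles that stay up: cat simply blocks waiting for input.
i=1
while [ "$i" -le "$good_count" ]; do
    emit_console "good$i" "cat"
    i=$((i + 1))
done

# Consoles that fail at once: the subshell exits with status 1 before sleep can run.
i=1
while [ "$i" -le "$fail_count" ]; do
    emit_console "fail$i" "(exit 1 || sleep 10)"
    i=$((i + 1))
done

console_total=$(grep -c '^console ' "$cf")
echo "wrote $console_total console blocks to $cf"
```

Raising both counts so that they sum to roughly 30 would approximate the lab setup described above.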

In contrast, when I change the failing exec()s to sleep 10\; exit 1, all of the consoles come up as expected on their own after about a minute.

This leads me to believe the issue only occurs when some consoles error out immediately.
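The difference between the two failing commands can be checked outside conserver. The sketch below runs each through `sh -c` and records the exit status and elapsed time. The helper name `run_timed` is hypothetical, the delay is shortened from 10 to 2 seconds to keep the demo quick, and the backslash before the semicolon is dropped on the assumption that it is only needed as escaping inside conserver.cf.

```shell
#!/bin/sh
# Run a command string in a child shell; leave its exit status and
# elapsed wall-clock seconds in $status and $elapsed.
run_timed() {
    start=$(date +%s)
    sh -c "$1"
    status=$?
    end=$(date +%s)
    elapsed=$((end - start))
    echo "cmd='$1' status=$status elapsed=${elapsed}s"
}

# Fails immediately: exit 1 ends the subshell, so "|| sleep 10" never runs.
run_timed '(exit 1 || sleep 10)'
immediate_status=$status
immediate_elapsed=$elapsed

# Fails after a delay (2 seconds here instead of the 10 in the report).
run_timed 'sleep 2; exit 1'
delayed_status=$status
delayed_elapsed=$elapsed
```

Both commands exit with status 1. Only the time before the failure differs, which is consistent with the observation that the problem depends on how quickly the console command exits.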

MitchellAugustin commented 6 months ago

A similar issue also appears if I change the failing console's exec to exec(exit 1) and then set -F as a conserver option.

My understanding of -F is that it should prevent recurring attempts each minute to bring up failed consoles (and that a "failed" console is any console with a nonzero exit code). While -F does eliminate the "automatic reinitialization" messages, it does not prevent the actual exec(), so I see

[Tue Mar  5 21:08:04 2024] conserver (556572): [fail1] console up
[Tue Mar  5 21:08:04 2024] conserver (556572): [fail1] exit(1)
[Tue Mar  5 21:08:05 2024] conserver (556580): [fail1] console up
[Tue Mar  5 21:08:05 2024] conserver (556580): [fail1] exit(1)
...

repeating indefinitely.