Open elondaits opened 6 years ago
"Interactive authentication required." is now fixed with the recent hilbert-station.
scriptConnection to 172.16.21.54 closed by remote host.
WARNING [hilbert_cli_config.py:278]: Error exit code 255, while executing 'ssh -q -F /root/SSH/config 172.16.21.54 hilbert-station -v shutdown now'!
ERROR [hilbert_cli_config.py:1435]: Could not run remote ssh command: 'ssh -q -F /root/SSH/config 172.16.21.54 hilbert-station -v shutdown now'! Return code: 255
WARNING [hilbert_cli_config.py:2266]: Could not schedule immediate shutdown on the station '172.16.21.54'
is a bit misleading, since it is exactly the expected correct behavior: shutdown now is supposed to immediately cut the current ssh connection to the station, and hilbert just detects that the ssh client was terminated abruptly. I am going to use a slightly delayed shutdown instead of now, so that the remote execution can finalize correctly.
It seems that the smallest delay can be 1 minute (via +1), which is also the default delay for shutdown (If no time argument is specified, "+1" is implied.)...
@porst17 @elondaits is it OK for hilbert stop to schedule the remote station shutdown in a minute, finalize the execution correctly, and rely on the station's shutdown to actually schedule and perform the system shutdown?
Note that with a cut ssh connection we have a guarantee that the shutdown has actually started (we can detect the connection being cut)...
A 1 minute delay for a "cosmetic" problem is not a good idea. Also, making the process more complex makes it more likely to fail, and if it fails after the delay you still get no notification. I'd just remove the three errors (WARNING/ERROR/WARNING) it now produces because as you first said, it's not actually an error... so it shouldn't be reported as such.
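A minimal sketch of that suggestion, not the actual hilbert code: treat ssh exit status 255 as success only when the remote command is expected to cut the connection. The names `run_remote`, `expect_disconnect` and `runner` are hypothetical and exist only for illustration. One caveat: ssh also returns 255 when it cannot connect at all, so this check alone does not distinguish a started shutdown from an unreachable station.

```python
import subprocess

# ssh returns the remote command's exit status, or 255 if ssh itself failed
# (for example, because the connection was cut).
SSH_CONNECTION_LOST = 255


def run_remote(ssh_args, remote_cmd, expect_disconnect=False, runner=subprocess.call):
    """Run remote_cmd over ssh and return True if the result counts as success.

    With expect_disconnect=True, exit status 255 is accepted, because an
    immediate remote shutdown is expected to drop the ssh connection.
    `runner` is injectable (hypothetical hook) so the logic can be tested
    without a real station.
    """
    rc = runner(["ssh"] + list(ssh_args) + list(remote_cmd))
    if rc == 0:
        return True
    if expect_disconnect and rc == SSH_CONNECTION_LOST:
        return True
    return False
```

With this shape, only the shutdown call site would pass `expect_disconnect=True`; every other remote command would keep reporting 255 as an error.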
You can schedule the delayed shutdown via nohup sh -c 'sleep 2s; shutdown -P now' & (if 2s is long enough). It is not as robust as shutdown -P +1, but the current shutdown -P now just cuts the ssh connection and leaves you behind without any information on whether the shutdown was scheduled correctly. So I would argue that the nohup method is on the same level of reliability as shutdown -P now.
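As a rough illustration of the nohup approach, the remote command string could be assembled like this. The helper `delayed_shutdown_command` is hypothetical. Two details are assumptions added here rather than taken from the comment above: the delay is written as a plain number of seconds (more portable than the `2s` suffix), and the output is redirected so that ssh does not wait on the background job's open streams.

```python
import shlex


def delayed_shutdown_command(delay_seconds=2):
    """Build a shell command that powers the machine off after a short delay.

    The command is detached with nohup and backgrounded with '&', so the ssh
    session that launches it can return normally before the shutdown starts.
    """
    inner = "sleep {}; shutdown -P now".format(int(delay_seconds))
    # shlex.quote wraps the inner script in single quotes for 'sh -c'.
    return "nohup sh -c {} >/dev/null 2>&1 &".format(shlex.quote(inner))


print(delayed_shutdown_command(2))
# nohup sh -c 'sleep 2; shutdown -P now' >/dev/null 2>&1 &
```

Whether 2 seconds is enough for the ssh session to close cleanly depends on the station, as noted above.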
If you don't want to implement it this way, I also think it is OK to just silence the error messages in case of a shutdown -P now.
I stopped bigfoot60 and there were some errors near the end of the procedure:
Full log: station fails to stop.txt