Submitty / Submitty

Homework Submission, Automated Grading, and TA grading system.
http://submitty.org
BSD 3-Clause "New" or "Revised" License

Automatically restart websocket server #10450

Open cjreed121 opened 4 months ago

cjreed121 commented 4 months ago

What problem are you trying to solve with Submitty?

The websocket server is a long-lived PHP process that can occasionally fail. This process is only restarted during installs.

Describe the way you'd like to solve this problem

We should add a cron job that restarts the websocket server every night.

Additional context

An installation could run at a time that conflicts with this cron job. While an install is running, the cron job should either be disabled or detect the installation and skip the restart. If the two do conflict, the socket server could start with incomplete code or dependencies, which could cause undefined behavior.
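One way the cron job could be made aware of a running install is a shared lock file. The following is only a minimal sketch under stated assumptions: the lock path and the systemd unit name are hypothetical, not taken from Submitty, and it only works if the installer holds an exclusive `flock` on the same file for the duration of the install.

```python
import fcntl
import subprocess
from typing import Callable

# Hypothetical values for illustration; Submitty's real paths/unit names may differ.
INSTALL_LOCK = "/tmp/submitty_install.lock"
RESTART_CMD = ["systemctl", "restart", "submitty_websocket_server"]


def _default_restart() -> None:
    # Assumes the socket server is managed by systemd under the name above.
    subprocess.run(RESTART_CMD, check=True)


def restart_unless_installing(
    lock_path: str = INSTALL_LOCK,
    restart: Callable[[], object] = _default_restart,
) -> bool:
    """Restart the server unless an installer currently holds the lock.

    Returns True if the restart ran, False if it was skipped.
    """
    with open(lock_path, "a") as lock_file:
        try:
            # Non-blocking: fail immediately if someone else holds the lock.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False  # an install is in progress; skip this run
        try:
            restart()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
        return True
```

Holding the lock during the restart also keeps an install from starting halfway through it, provided the installer takes the same lock before touching code or dependencies.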

MasterOdin commented 4 months ago

If the websocket server were to fail, I'd think it should be automatically restarted by systemd? Unless by "fail" you mean the server keeps running, but the PHP process itself stops accepting new connections?
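The second case (process still running, but no new connections accepted) is one a supervisor will not see as a crash, so it needs a probe from outside. A rough sketch of such a probe follows; the function name is hypothetical, and the host and port would have to come from the actual deployment.

```python
import socket


def accepts_connections(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

This only checks the TCP layer. A process that is hung but still has its listening socket open can pass, because the kernel completes the handshake on its behalf, so a more faithful check would need to perform a full WebSocket handshake.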

bmcutler commented 4 months ago

@MasterOdin

We had a websocket outage/failure this Spring. Websockets had been broken for a week or two at that point. Frustratingly, no one reported it because everyone just assumed I knew! When I noticed myself that websockets were down, I ran through my debugging notes at https://submitty.org/sysadmin/troubleshooting/system_debugging. Sorry I didn't record screenshots; things were offline, but without the usual, obvious error messages. Restarting the socket server fixed the problem immediately. It was unsettling/unsatisfying not to know what had caused the problem, or what was wrong.

It was possibly the longest period the server had run without installing new software, and so without the stop/start of sockets/apache/etc. It had been running for about 2 months between releases, since we didn't need to install updates for urgently needed bugfixes or new features. I don't believe restarting it every night is necessary, but I think a once-a-week restart wouldn't hurt anything. Maybe other things (e.g., jobs daemon, apache, or even autograding queue) could/should be preventatively restarted as well.