JulianFP / project-W-runner

This is the code for the runners that run the whisper jobs themselves, which are managed by the Project W Flask backend.
GNU Affero General Public License v3.0

Either of the two threads can crash leaving the other up running (and thus preventing an automatic restart) #20

Closed JulianFP closed 3 weeks ago

JulianFP commented 1 month ago

One thread sends the heartbeats; the other processes the current whisper job. If the heartbeat thread crashes, the runner stays up for as long as the job takes to complete, but since it no longer sends any heartbeats the backend considers it offline, and it cannot submit its job when it's finally done. If the job thread crashes, the runner can end up in a state where it still sends heartbeats and thus appears online to the backend, but cannot process jobs because the variable that tells it whether it is already processing one is still set. Neither case is theoretical: both have been observed in the wild.
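One possible way to avoid both states is to tie the two threads together, so that a crash in either one stops the other and lets the process exit. The following is only a minimal sketch of that idea, not the runner's actual code: `run_supervised`, `heartbeat_loop`, and `job_loop` are hypothetical names, and the worker bodies are placeholders.

```python
import threading
from typing import Callable, Optional

Worker = Callable[[threading.Event], None]


def run_supervised(*workers: Worker, join_timeout: float = 5.0) -> Optional[BaseException]:
    """Run each worker in its own thread; stop all of them once any one ends.

    Returns the first exception raised by a worker, or None if a worker
    simply returned.
    """
    stop = threading.Event()
    lock = threading.Lock()
    first_error = []  # holds at most one exception

    def guard(worker: Worker) -> None:
        try:
            worker(stop)
        except BaseException as exc:  # record any crash, not just Exception
            with lock:
                if not first_error:
                    first_error.append(exc)
        finally:
            # Crash or normal return: signal the sibling threads to stop.
            stop.set()

    threads = [threading.Thread(target=guard, args=(w,), daemon=True) for w in workers]
    for t in threads:
        t.start()

    stop.wait()  # blocks until some worker has ended
    for t in threads:
        t.join(timeout=join_timeout)
    return first_error[0] if first_error else None


def heartbeat_loop(stop: threading.Event, interval: float = 15.0) -> None:
    """Hypothetical heartbeat worker: wakes up every `interval` seconds."""
    while not stop.wait(interval):
        pass  # placeholder: send one heartbeat to the backend here


def job_loop(stop: threading.Event, poll: float = 1.0) -> None:
    """Hypothetical job worker: checks for work every `poll` seconds."""
    while not stop.wait(poll):
        pass  # placeholder: fetch and process one whisper job here
```

A caller could then run `error = run_supervised(heartbeat_loop, job_loop)` and exit with a non-zero status when `error` is not `None`, so that whatever supervises the process (a service manager or container runtime, for example) can restart the runner as a whole. The threads are marked as daemon threads so that a worker stuck in a long job does not keep the process alive after the other one has crashed.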