For our Docker setup, we use the `horizon:status` command as the health check, so if Horizon fails to load or crashes, a new container is started.
We now pause the running Horizon processes before we push an update. This way we can update the web image and launch a new Horizon image without the old Horizon image complaining about jobs it doesn't understand (for example, a new job that only exists in the new image's code).
But now that we have added health checks to the Horizon images too, a paused container is immediately replaced with a new instance of the old image, because the paused status triggers the failing exit code of the status command (which is fair, since Horizon is indeed not running). To tell whether Horizon is just paused or actually inactive, it would be nice to have different exit codes. That way we can check for exit code 2 to relaunch the existing image in a new container, and leave the paused containers "running".
Changing the exit codes could break other people's scripts that rely on a specific exit code, though a basic non-success check should work with both the old and the new behavior. So I'm not sure whether we can ship this without at least a minor version bump.
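For illustration, a health check wrapper along these lines could make use of the separate exit codes. This is only a sketch: it assumes `horizon:status` exits with 0 when running, 2 when inactive, and 1 when paused (the paused code is an assumption; only 2 is mentioned above), and the function names `horizon_health` and `check_horizon` are made up for the example. Docker's `HEALTHCHECK` treats exit code 0 as healthy and 1 as unhealthy and reserves 2, so the wrapper maps to 0 or 1 instead of passing the code through.

```shell
#!/bin/sh

# Translate a horizon:status exit code into a Docker health result.
# Assumed codes: 0 = running, 1 = paused (assumption), 2 = inactive.
horizon_health() {
    case "$1" in
        0|1) return 0 ;;  # running or paused: keep the container
        *)   return 1 ;;  # inactive or unexpected failure: report unhealthy
    esac
}

# Hypothetical wrapper: run the status command and map its exit code.
check_horizon() {
    php artisan horizon:status >/dev/null 2>&1
    horizon_health "$?"
}
```

Saved as a script that ends with a call to `check_horizon`, this could serve as the container's health check command. Anything other than "running" or "paused" is treated as unhealthy, so a missing `php` binary or a crashed command still triggers a replacement.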