Closed n1nj4888 closed 3 years ago
This is intended as influx is critical for the operation. Please ensure that influx is started before varken.
I understand that influx is critical for varken, but consider the scenario where the physical Docker node reboots. The Influx container takes FAR longer to start than varken, so varken simply tries once and then exits with an exit code of 0 (OK). Since it exits with exit code 0, the Docker engine does not attempt to restart the container.
The issue is that varken should be exiting with a non-zero exit code if it exits with a critical error so that the Docker engine/swarm orchestrator can action it accordingly. This is how many other containers work...
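To make the reboot scenario above concrete, here is a minimal sketch of a startup check that retries a few times before giving up with a non-zero status. This is not Varken's code: `wait_for_database`, `main`, and the `ping` callable are hypothetical names, and the attempt count and delay are arbitrary assumptions.

```python
import sys
import time


def wait_for_database(ping, attempts=5, delay=2.0, sleep=time.sleep):
    """Return True once `ping` succeeds, False after `attempts` failures.

    `ping` is a hypothetical callable that raises ConnectionError
    while the database is still unreachable.
    """
    for attempt in range(1, attempts + 1):
        try:
            ping()
            return True
        except ConnectionError:
            # Pause between attempts, but not after the last one.
            if attempt < attempts:
                sleep(delay)
    return False


def main(ping):
    # Give up with a non-zero status so the orchestrator sees a failure.
    if not wait_for_database(ping):
        sys.exit(1)
```

Retrying only narrows the startup race. The non-zero exit status is still what lets the Docker engine or swarm orchestrator treat the final outcome as a failure.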
@samwiseg0 He is right about the exit code.
Offending line: https://github.com/Boerderij/Varken/blob/master/varken/dbmanager.py#L23
Documentation: https://docs.python.org/3/library/sys.html#sys.exit
Relevant snippet from documentation:
The optional argument arg can be an integer giving the exit status (defaulting to zero)
Proposed resolution: exit(1)
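A minimal sketch of what the proposed resolution could look like, assuming the connection test raises `ConnectionError` on failure. `check_connection` and `ping` are hypothetical names for illustration and are not taken from `dbmanager.py`.

```python
import logging
import sys

logger = logging.getLogger("dbmanager")


def check_connection(ping):
    """Run the hypothetical `ping` callable and exit non-zero if it fails."""
    try:
        ping()
    except ConnectionError:
        logger.critical(
            "Error testing connection to InfluxDB. Please check your url/hostname"
        )
        # A bare sys.exit() would report status 0 (success).
        # Passing 1 marks the process as failed.
        sys.exit(1)
```

`sys.exit(1)` raises `SystemExit` with code 1, so the interpreter terminates with a status that process supervisors can tell apart from a clean shutdown.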
Yep. I will look at fixing it in develop
Hi @samwiseg0. Any update on this issue since it doesn’t seem to have been fixed yet in develop? Thanks!
Hi there,
When the varken container starts before InfluxDB is ready and cannot contact InfluxDB, varken fails with the following log entries:
2020-04-09 09:17:02 : INFO : Varken : Starting Varken...
2020-04-09 09:17:02 : INFO : Varken : Data folder is "/config"
2020-04-09 09:17:02 : INFO : Varken : Linux 5.3.0-46-generic (#38~18.04.1-Ubuntu SMP Tue Mar 31 04:17:56 UTC 2020 - Alpine Linux 3.10.0)
2020-04-09 09:17:02 : INFO : Varken : Python 3.7.3 (default, Jun 27 2019, 22:53:21) [GCC 8.3.0]
2020-04-09 09:17:02 : INFO : Varken : Varken v1.7.6-master
2020-04-09 09:17:02 : INFO : helpers : SONARR_SERVER_IDS : [1]
2020-04-09 09:17:02 : INFO : helpers : RADARR_SERVER_IDS : [1]
2020-04-09 09:17:02 : INFO : iniparser : LIDARR_SERVER_IDS disabled.
2020-04-09 09:17:02 : INFO : iniparser : OMBI_SERVER_IDS disabled.
2020-04-09 09:17:02 : INFO : helpers : TAUTULLI_SERVER_IDS : [1, 2]
2020-04-09 09:17:02 : INFO : iniparser : SICKCHILL_SERVER_IDS disabled.
2020-04-09 09:17:02 : INFO : iniparser : UNIFI_SERVER_IDS disabled.
2020-04-09 09:17:02 : CRITICAL : dbmanager : Error testing connection to InfluxDB. Please check your url/hostname
Although the error is logged as "CRITICAL", the container exits with an exit code of 0, as shown by the following Portainer Inspect details for the stopped container:
State
Dead false
Error
ExitCode 0
FinishedAt 2020-04-09T01:17:02.371494764Z
OOMKilled false
Paused false
Pid 0
Restarting false
Running false
StartedAt 2020-04-09T01:17:01.096798522Z
Status exited
The issue here is that, I believe, an ExitCode of 0 is not treated as a failure. Therefore, if the varken service is set up with a swarm restart_policy of "on-failure", the Docker swarm managers will not attempt to restart the container ...
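The difference a supervisor sees can be checked without Docker. The sketch below launches two short child Python processes and reads back their exit statuses; the helper name `exit_status` is made up for this example, and it only illustrates the status code itself, not Docker's restart logic.

```python
import subprocess
import sys


def exit_status(snippet):
    """Run `snippet` in a child Python process and return its exit status."""
    return subprocess.run([sys.executable, "-c", snippet]).returncode


# sys.exit() with no argument reports success, even after a critical error.
bare = exit_status("import sys; sys.exit()")

# sys.exit(1) reports failure to whatever launched the process.
failing = exit_status("import sys; sys.exit(1)")

print(f"bare sys.exit(): {bare}, sys.exit(1): {failing}")
```

An "on-failure" restart condition keys off this status, so only the second case would be seen as a failure.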