fablabbcn / smartcitizen-api

The Smart Citizen Engine
https://developer.smartcitizen.me
GNU Affero General Public License v3.0

Kairos exits in staging #268

Closed oscgonfer closed 7 months ago

oscgonfer commented 11 months ago

Due to the limited size of the staging machine, we believe that Cassandra becomes overloaded and Kairos is unable to reach it. This causes Kairos to exit after a certain number of connection attempts.

We should review the conditions under which this happens and then adjust the docker-compose.yml in: https://github.com/fablabbcn/smartcitizen-api/pull/248

oscgonfer commented 10 months ago

After checking the default KairosDB configuration:

https://raw.githubusercontent.com/kairosdb/kairosdb/develop/src/main/resources/kairosdb.conf

When start_async is set to true a background thread is created to try and connect to cassandra when starting up Kairos. This allows Kairos to start even if Cassandra is not yet available. The background thread repeatedly attempts to connect every 1sec until it is successful. Setting start_async to false means kairos will fail to start if Cassandra is not available.

start_async: false
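To make the difference between the two modes concrete, here is a minimal Python sketch of the behaviour the config comment describes. It is not KairosDB's actual code: the `start` function, the `connect` callable and the `retry_interval` parameter are hypothetical names for illustration, and the fail-fast branch is simplified to a single attempt.

```python
import threading
import time
from typing import Callable, Optional


def start(connect: Callable[[], bool], start_async: bool,
          retry_interval: float = 1.0) -> Optional[threading.Thread]:
    """Hypothetical startup routine mirroring the two documented modes."""
    if not start_async:
        # Fail fast (simplified to one attempt): startup aborts if the
        # datastore cannot be reached.
        if not connect():
            raise RuntimeError("datastore unavailable at startup")
        return None

    def retry_until_connected() -> None:
        # Keep retrying in the background until a connection succeeds,
        # so the main process can finish starting in the meantime.
        while not connect():
            time.sleep(retry_interval)

    worker = threading.Thread(target=retry_until_connected, daemon=True)
    worker.start()
    return worker
```

With `start_async=True` the caller gets control back immediately and the connection is established whenever the datastore comes up; with `start_async=False` an unavailable datastore stops the process, which matches what we see in staging.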

After looking at how we use this in our config, it seems we don't set it at all. @timcowlishaw, is this potentially worth considering in relation to the compose changes you made?

timcowlishaw commented 10 months ago

Aha, maybe adding some docker dependencies and health checks might help with this? I'll give it a go :-)
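As a rough illustration of what such a health check could do, here is a small Python sketch that waits for a dependency's TCP port to accept connections. It is an assumption for discussion, not what the linked PRs implement: `wait_for_port` and its parameters are hypothetical names, and an open port (for example Cassandra's default native transport port, 9042) only shows that the process is listening, not that it is ready to serve queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 30,
                  delay: float = 2.0) -> bool:
    """Return True once host:port accepts a TCP connection, else False."""
    for _ in range(attempts):
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=delay):
                return True
        except OSError:
            # Refused or timed out: wait and try again.
            time.sleep(delay)
    return False
```

A wrapper script could call `wait_for_port("cassandra", 9042)` (the hostname is a hypothetical compose service name) and exit non-zero on `False`, so that the dependent container only starts once the probe passes.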

timcowlishaw commented 10 months ago

Here's an example for further discussion and testing: https://github.com/fablabbcn/smartcitizen-api/pull/283

oscgonfer commented 8 months ago

As discussed by phone, this can probably still be improved by having Kairos check the health status of telnet-task, so that it does not fall over in staging.

oscgonfer commented 7 months ago

This has been solved by increasing the size of the machine and by the changes made in https://github.com/fablabbcn/smartcitizen-api/pull/283 and other compose-related PRs.