Open atrauzzi opened 1 week ago
An explicit flag to instruct FusionAuth to retry its database connection for a certain period/interval
We currently support this via DATABASE_CONNECTION_TIMEOUT, as documented here: https://fusionauth.io/docs/reference/configuration
An explicit flag to instruct FusionAuth to terminate when it cannot connect to the database
Do you mean a configuration parameter that, when set, causes FusionAuth to fail hard and refuse requests when it can't reach a database?
Hmm, a "timeout" typically means how long before the connection is failed. Either way, I don't think the discrepancy here is around whether there is something triggering timeout behaviour. Although just for posterity, I've tried setting FUSIONAUTH_DATABASE_CONNECTION_TIMEOUT to 30000, just to see what happens:
The server goes into silent configuration mode and never recovers, and then never runs my kickstart.
Now, if I do the following temporary workaround:
bash -c "sleep 30 && /usr/local/fusionauth/fusionauth-app/bin/start.sh"
After thirty seconds, FusionAuth starts and is able to connect to the database. So I know I have the potential for a working FusionAuth configuration and that there's nothing wrong with my setup. It's merely a matter of convincing FusionAuth to actually retry.
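A fixed thirty-second sleep is a guess about how long the database needs. A minimal sketch of the same idea that polls the database port instead might look like the following. It assumes bash (for the /dev/tcp pseudo-device); `wait_for_port` is a hypothetical helper written for this example, not something FusionAuth ships, and the host name and port in the usage comment are placeholders for whatever your environment uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a TCP port until it accepts connections.
# Usage: wait_for_port <host> <port> [attempts] [delay_seconds]
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}" delay="${4:-1}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>.
    # Running it in a subshell means the descriptor is closed again on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example use as a container command (host "db" and port 5432 are assumptions):
#   wait_for_port db 5432 60 1 && exec /usr/local/fusionauth/fusionauth-app/bin/start.sh
```

This only proves the port is accepting TCP connections, not that the database is ready to serve queries, so it narrows the race rather than removing it.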
So, coming back to what you mention above - and provided I have the config value name correct - I'm not sure the timeout is either working or necessarily the right solution.
The central question here may not even be around whether there is a timeout configured, but more around what happens after a timeout.
Here, FusionAuth could present a number of behaviours, but ideally it might be good to allow people to pick which one is best for their environment. That matters especially given the first-time, out-of-the-box setup experience that FusionAuth offers, which in my scenario is actually not helpful because I'm using kickstart and API calls to complete configuration non-interactively.
Thanks @atrauzzi .
I did some digging, and it looks like the FUSIONAUTH_DATABASE_CONNECTION_TIMEOUT variable doesn't apply at startup, when we're in maintenance mode and trying to find a database to connect to. It only applies to connections made after startup, when the database connection is managed by our connection pool.
We do try to reconnect multiple times when starting up, but I'll take a closer look.
Better support for resilience-oriented setups
Problem
Not all orchestrators or environments support the notion of dependent services to ensure that things like the database come up before FusionAuth starts.
The approach to how this is handled varies by community and one competing school of thought to specifying dependencies is to rely on restarts and/or retries.
Unfortunately, when FusionAuth fails to get a lock on a database in silent, maintenance mode, it does not terminate or make any retry attempts.
This means that an environment that wishes to handle resilience through restarts or in-built retry mechanisms (or both!) has no way of guiding FusionAuth to a working state.
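To make the two resilience styles concrete, here is a rough sketch of an external wrapper that retries a check for a bounded period and then returns non-zero, so that a supervisor or orchestrator can restart the process. It is not FusionAuth behaviour; `retry_or_fail` is a hypothetical name, and the check command you pass in is whatever "database is reachable" means in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: re-run a command until it succeeds or time runs out.
# Usage: retry_or_fail <max_seconds> <interval_seconds> <command> [args...]
retry_or_fail() {
  local max_seconds="$1" interval="$2"
  shift 2
  # SECONDS is a bash built-in counting seconds since the shell started.
  local deadline=$((SECONDS + max_seconds))
  until "$@"; do
    if ((SECONDS >= deadline)); then
      echo "giving up after ${max_seconds}s: $*" >&2
      return 1   # non-zero status lets the caller terminate or restart
    fi
    sleep "$interval"
  done
  return 0
}
```

A caller could write `retry_or_fail 120 5 some_db_check || exit 1`, where `some_db_check` is a placeholder, to get "retry for a period, then terminate" in one line.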
Solution
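All or some combination of:
An explicit flag to instruct FusionAuth to retry its database connection for a certain period/interval
An explicit flag to instruct FusionAuth to terminate when it cannot connect to the database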
Optionally, one of these could also be the default when running with FUSIONAUTH_APP_SILENT_MODE set to true.
Alternatives/workarounds
Unfortunately, there is no alternative or workaround. In some ways, the premise of resilience is itself a workaround-oriented approach.