TwiN / gatus

⛑ Automated developer-oriented status page
https://gatus.io
Apache License 2.0
6.46k stars, 428 forks

Crash when OIDC is unavailable #830

Open xeruf opened 3 months ago

xeruf commented 3 months ago

Describe the bug

When OIDC is configured and the OIDC provider is unavailable (which can happen temporarily, for example during an upgrade), Gatus crashes.

What do you see?

image

What do you expect to see?

Maybe a landing page with an error message, and maybe it could retry a few times? Also, what about a fallback administrative password or something?

List the steps that must be taken to reproduce this issue

No response

Version

5.11

Additional information

No response

TwiN commented 3 months ago

Hmm, this is probably happening because on start, OIDC retrieves the .well-known metadata endpoint or something along those lines, and because it's not available, it just fails to initialize...

On one hand, I understand your suggestion. On the other hand, this should only happen while Gatus is loading its configuration (on start or on hot-reload). If Gatus is unable to start with the desired configuration, I think the proper behavior is for Gatus not to start; otherwise, its running state would not accurately reflect its configuration.

It may sound like I'm making excuses here, but to be honest, the fact that this happens is pretty much a security feature working as intended. If it just fell back to an HTTP basic user/pass prompt or, even worse, no authentication at all, I think Gatus would lose a few points as far as compliance goes 😅

xeruf commented 1 month ago

that's why I think a few retries over a few minutes might be the best idea

TwiN commented 1 month ago

Maybe not over a few minutes, because I think it's fine to rely on the orchestration mechanism to bubble up the issue if it lasts long enough, but I wouldn't be against retrying a few seconds after the first failure.