erlef / oidcc

OpenId Connect client library in Erlang & Elixir
https://hexdocs.pm/oidcc
Apache License 2.0

`oidcc_provider_configuration_worker` backoff #324

Closed paulswartz closed 8 months ago

paulswartz commented 10 months ago

oidcc version

3.1.0

Erlang version

unsure

Elixir version

unsure

Summary

Migrated from https://gitlab.com/paulswartz/ueberauth_oidcc/-/issues/7:

We use an up-and-coming OpenID Connect provider that goes into maintenance for a few hours several times a year.

We ran into an issue where downtime at the OpenID Connect provider caused downtime for our service.

journalctl log:

Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: 15:56:26.487 pid=<0.3212.0>  [error] GenServer :zitadel terminating
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: ** (stop) {:configuration_load_failed, {:http_error, 404, "\n<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>404 Page not found</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Page not found</h1>\n<h2>The requested URL was not found on this server.</h2>\n<h2></h2>\n</body></html>\n"}}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: Last message: {:continue, :load_configuration}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: State: {:state, {:oidcc_provider_configuration, "https://redacted.zitadel.cloud", "https://redacted.zitadel.cloud/oauth/v2/authorize", "https://redacted.zitadel.cloud/oauth/v2/token", "https://redacted.zitadel.cloud/oidc/v1/userinfo", "https://redacted.zitadel.cloud/oauth/v2/keys", :undefined, ["openid", "profile", "email", "phone", "address", "offline_access"], ["code", "id_token", "id_token token"], ["query", "fragment"], ["authorization_code", "implicit", "refresh_token", "client_credentials", "urn:ietf:params:oauth:grant-type:jwt-bearer", "urn:ietf:params:oauth:grant-type:device_code"], :undefined, [:public], ["RS256"], :undefined, :undefined, :undefined, :undefined, :undefined, ["RS256"], :undefined, :undefined, ["none", "client_secret_basic", "client_secret_post", "private_key_jwt"], ["RS256"], :undefined, [:normal], ["sub", "aud", "exp", "iat", "iss", "auth_time", "nonce", "acr", "amr", "c_hash", "at_hash", "act", "scopes", "client_id", "azp", "preferred_username", "name", "family_name", "given_name", "locale", "email", ...], :undefined, :undefined, ["bg", "cs", "de", "en", "es", "fr", "it", "ja", "mk", "nl", "pl", "pt", "ru", "zh"], false, true, false, false, :undefined, :undefined, "https://redacted.zitadel.cloud/oauth/v2/revoke", ["none", "client_secret_basic", "client_secret_post", "private_key_jwt"], ["RS256"], "https://redacted.zitadel.cloud/oauth/v2/introspect", ["client_secret_basic", "private_key_jwt"], ["RS256"], ["S256"], "https://redacted.zitadel.cloud/oidc/v1/end_session", %{"device_authorization_endpoint" => "https://redacted.zitadel.cloud/oauth/v2/device_authorization"}}, {:jose_jwk, {:jose_jwk_set, []}, :undefined, %{}}, "https://redacted.zitadel.cloud", %{}, :undefined, :undefined, :zitadel_table}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: 15:56:26.501 pid=<0.4768635.0>  [error] GenServer :zitadel terminating
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: ** (stop) {:configuration_load_failed, {:http_error, 404, "\n<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>404 Page not found</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Page not found</h1>\n<h2>The requested URL was not found on this server.</h2>\n<h2></h2>\n</body></html>\n"}}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: Last message: {:continue, :load_configuration}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: State: {:state, :undefined, :undefined, "https://redacted.zitadel.cloud", %{}, :undefined, :undefined, :zitadel_table}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: 15:56:26.515 pid=<0.4768693.0>  [error] GenServer :zitadel terminating
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: ** (stop) {:configuration_load_failed, {:http_error, 404, "\n<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>404 Page not found</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Page not found</h1>\n<h2>The requested URL was not found on this server.</h2>\n<h2></h2>\n</body></html>\n"}}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: Last message: {:continue, :load_configuration}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: State: {:state, :undefined, :undefined, "https://redacted.zitadel.cloud", %{}, :undefined, :undefined, :zitadel_table}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: 15:56:26.529 pid=<0.4768592.0>  [error] GenServer :zitadel terminating
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: ** (stop) {:configuration_load_failed, {:http_error, 404, "\n<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>404 Page not found</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Page not found</h1>\n<h2>The requested URL was not found on this server.</h2>\n<h2></h2>\n</body></html>\n"}}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: Last message: {:continue, :load_configuration}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: State: {:state, :undefined, :undefined, "https://redacted.zitadel.cloud", %{}, :undefined, :undefined, :zitadel_table}
Dec 30 16:56:26 Ubuntu-2004-focal-64-minimal tp[720278]: 15:56:26.530 pid=<0.2737.0>  [notice] Application ueberauth_oidcc exited: shutdown
Dec 30 16:56:43 Ubuntu-2004-focal-64-minimal tp[720278]: Kernel pid terminated (application_controller) ("{application_terminated,ueberauth_oidcc,shutdown}")
Dec 30 16:56:43 Ubuntu-2004-focal-64-minimal tp[720278]: 
Dec 30 16:56:43 Ubuntu-2004-focal-64-minimal tp[720278]: Crash dump is being written to: erl_crash.dump...done
Dec 30 16:56:43 Ubuntu-2004-focal-64-minimal systemd[1]: redacted.service: Main process exited, code=exited, status=1/FAILURE
Dec 30 16:56:43 Ubuntu-2004-focal-64-minimal systemd[1]: redacted.service: Failed with result 'exit-code'.
Dec 30 16:56:44 Ubuntu-2004-focal-64-minimal systemd[1]: redacted.service: Scheduled restart job, restart counter is at 1.

Current behavior

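When the issuer is offline, oidcc_provider_configuration_worker crashes with a configuration_load_failed error (wrapping an http_error, as in the log above). After a few rapid restarts the application shuts down.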

How to reproduce

{ok, Pid} = oidcc_provider_configuration_worker:start_link(#{ issuer => "https://invalid.example" }),
true = is_process_alive(Pid).

Expected behavior

oidcc_provider_configuration_worker does not crash. My initial thought would be that the various functions return provider_not_ready, but that doesn't look like it's part of the spec. Maybe the functions themselves still crash, but the worker itself does not?

maennchen commented 10 months ago

I wouldn't classify this as a bug since this is behaving as intended.

The idea was to handle such failures through the normal supervision tree. That is a common approach for workers that depend on external services.

I would, however, be open to a configurable retry strategy that lets the user specify how this should be handled.

For this, db_connection's backoff options come to mind:

https://github.com/elixir-ecto/db_connection/blob/fa5f705fa5d272ed28b64ee0954e4275c0260d36/lib/db_connection.ex#L383

* `:backoff_min` - The minimum backoff interval (default: `1_000`)
* `:backoff_max` - The maximum backoff interval (default: `30_000`)
* `:backoff_type` - The backoff strategy, `:stop` for no backoff and to stop, `:exp` for exponential, `:rand` for random and `:rand_exp` for random exponential (default: `:rand_exp`)
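
To make the options above concrete, here is a minimal Erlang sketch of how the four `:backoff_type` strategies could turn an attempt counter into a delay. It is an illustration only, not the implementation used by db_connection or oidcc: the module name `backoff_sketch`, the function `delay/4`, and the exact formulas are assumptions.

```erlang
-module(backoff_sketch).
-export([delay/4]).

%% Hypothetical helper (not part of oidcc or db_connection).
%% Type    : stop | exp | rand | rand_exp
%% Min/Max : bounds in milliseconds, with Min =< Max
%% Attempt : number of failed attempts so far, starting at 0
%% Returns the delay before the next attempt, or `stop`.
-spec delay(stop | exp | rand | rand_exp,
            pos_integer(), pos_integer(), non_neg_integer()) ->
    stop | pos_integer().
delay(stop, _Min, _Max, _Attempt) ->
    %% No backoff: give up and let the caller terminate.
    stop;
delay(exp, Min, Max, Attempt) ->
    %% Double the minimum interval per attempt, capped at Max.
    %% The shift is bounded so the intermediate value stays small.
    min(Max, Min bsl min(Attempt, 32));
delay(rand, Min, Max, _Attempt) ->
    %% Uniformly random delay in [Min, Max].
    Min + rand:uniform(Max - Min + 1) - 1;
delay(rand_exp, Min, Max, Attempt) ->
    %% Uniformly random delay in [Min, exponential ceiling].
    Ceiling = delay(exp, Min, Max, Attempt),
    Min + rand:uniform(Ceiling - Min + 1) - 1.
```

With the defaults quoted above (`1_000` and `30_000`), a worker using something like this would presumably schedule its next configuration load with `erlang:send_after/3` instead of stopping, except when the type is `stop`.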
maennchen commented 8 months ago

Closed in favor of PR #337