Round trips on discovery and HTTP caching

blankhart commented 4 years ago

Thank you for this great library. I am new to this so may have misunderstood the flows that are normal here.

If I understand the client API, a CoreClient has limited lifespan because the identity provider's underlying configuration may change underneath it, for instance if the identity provider rotates its keys each day.

To deal with this, the best practice seems to be to cache the discovery document and JWKS responses based on HTTP cache-control header max-age value, in order to avoid the need to revalidate that information in connection with each authentication.

For example, the Google OpenID Connect writeup says that:

You may be able to avoid an HTTP round-trip by caching the values from the Discovery document. Standard HTTP caching headers are used and should be respected.

The caching headers in the responses do not seem to be exposed via openidconnect::ProviderMetadata::discover or other high-level functions that are currently provided by the library.

Is the best way to get it currently to re-implement functionality similar to discover/discover_async and capture it through the HTTP client? Is there a recommended practice for keeping clients current in lieu of doing this?

ramosbugs commented 4 years ago

Hey, great question!

You're correct that this crate doesn't implement any caching functionality. Since OIDC Discovery relies on HTTP headers for its caching functionality, it feels like the caching is best handled at the HTTP client layer. One way I've solved this in another (proprietary, closed source unfortunately) codebase is to implement a wrapper around the reqwest client that implements the relevant portions of RFC 7234 and returns a cached response when appropriate. Then, simply pass that client to discover/discover_async, and the client will get up-to-date provider metadata that observes the caching headers.

With this approach, I recommend making each CoreClient short-lived (e.g., for server-side use cases, the duration of a single HTTP request). As long as the HTTP client is caching the discovery responses, this shouldn't introduce much overhead and ensures that the underlying metadata never becomes stale.

Since this is a pretty widespread need, I think the Rust ecosystem would benefit either from adding some caching functionality into reqwest itself, or by having a separate wrapper crate that adds caching (similar to what the https://crates.io/crates/static-http-cache crate looks like it aims to do). Re-implementing caching in each higher-level protocol crate seems like it would lead to a lot more bugs, code bloat, etc.

blankhart commented 4 years ago

That answer is elegant and makes a lot of sense for the library. If I understand it though, the current example does not follow the architectural recommendation of making each CoreClient short-lived to the extent of a single HTTP request. Understanding that the example is a simplification, it might be helpful to include a note.

In the example, the same CoreClient is reused when redirecting to the identity provider and when receiving the redirect from the identity provider, whereas using the strategy recommended in this issue, the CoreClient would be re-created in each case in reliance on the HTTP caching mechanism to ensure they are the same.

With approach in this issue, if the HTTP cache aged out between redirects, the CoreClient theoretically might be different when initiating and finishing the flow. I haven't studied this or the protocol enough to know if that could result in a false negative, granting of course that it would probably be a very rare occurrence in practice.

Timshel commented 9 months ago

Hey,

Wanted to mention that the HTTP client cache handling which would be the cleanest will probably not work with many providers. Ex:

Keycloak return no-cache on the discovery and keys endpoint (there is opened issues on the subject but no response).
Gitlab return max-age=0 on the discovery but does return max-age=86400 for the keys

I end up using a Moka cache with a configurable expiration. Are you aware of any other potential issues outside of the rollings keys ?

ramosbugs commented 9 months ago

Wanted to mention that the HTTP client cache handling which would be the cleanest will probably not work with many providers.

That's disappointing. Thanks for posting your findings!

I end up using a Moka cache with a configurable expiration. Are you aware of any other potential issues outside of the rollings keys ?

Any of the metadata values (keys, endpoint URLs, and allowed ID token signing algs) could theoretically change, and caching them forever might cause problems if providers ever change configurations. Unfortunately, HTTP cache headers are the means for providers to communicate those guarantees, and it's too bad that they're not leveraging it.

Separately, I've come across https://crates.io/crates/http-cache-reqwest recently, which looks like it might work as a drop-in HTTP caching layer for the reqwest client. This should be easier to accomplish soon in the 4.0 release of this crate, which will accept reqwest::Client as an HTTP client instead of the current function-based interface (see upstream changes in https://github.com/ramosbugs/oauth2-rs/blob/main/UPGRADE.md#use-stateful-http-clients).

ramosbugs / openidconnect-rs

Round trips on discovery and HTTP caching #25