quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.35k stars 2.56k forks source link

Quarkus OIDC makes first request very slow #41452

Open wesleysalimansdvb opened 1 week ago

wesleysalimansdvb commented 1 week ago

Describe the bug

Hi,

We use Quarkus OIDC for multi tenancy. We do not use Keycloak, we use our own authentication system. Therefore we set quarkus.oidc.discovery-enabled to false, which results in our OIDC config not getting loaded on startup, but on the first request. This makes the first request that gets done to our container (512 vCPU, 1024 memory) very slow (around 4 seconds).

Is there a way to make a healthcheck manually load the OIDC providers before the first request gets done to the container? Or could there be added a new configuration property to load the providers on startup (such as proposed in https://github.com/quarkusio/quarkus/issues/34890)

quarkus.oidc.enabled=true
quarkus.oidc.discovery-enabled=false
quarkus.oidc.use-blocking-dns-lookup=true
quarkus.oidc.bca.public-key=${OUR_PUBLIC_KEY}
quarkus.oidc.bca.token.audience=bsp
quarkus.oidc.bca.token.issued-at-required=false
quarkus.oidc.bca.application-type=service
quarkus.oidc.bca.token.signature-algorithm=rs256
quarkus.oidc.auth-server-url=${JWKS_LOCATION}
quarkus.oidc.jwks-path=${JWKS_LOCATION}
quarkus.oidc.token.issued-at-required=false
quarkus.oidc.token.signature-algorithm=es256
quarkus.oidc.application-type=service

We also have our own TenantResolverConfig, to get the tenant config based on the tenant:

@ApplicationScoped
public class TokenTenantResolver implements TenantConfigResolver {
  private static final String ISSUER_PATH = "iss";

  @ConfigProperty(name = "quarkus.oidc.bca.token.issuer")
  String bcaIssuer;

  @ConfigProperty(name = "quarkus.oidc.bca.public-key")
  String bcaPublicKey;

  @SneakyThrows
  public Uni<OidcTenantConfig> resolve(
      RoutingContext context, OidcRequestContext<OidcTenantConfig> requestContext) {
    final var jwtString = context.request().headers().get("Authorization");

    if (jwtString == null) {
      return Uni.createFrom().nullItem();
    }

    final var jwtHeaders = io.quarkus.oidc.runtime.OidcUtils.decodeJwtContent(jwtString);

    if (jwtHeaders != null && jwtHeaders.getValue(ISSUER_PATH).equals(bcaIssuer)) {
      OidcTenantConfig config = new OidcTenantConfig();
      config.setTenantId("bca");
      config.setPublicKey(bcaPublicKey);
      config.getToken().setIssuedAtRequired(false);
      config.setApplicationType(OidcTenantConfig.ApplicationType.SERVICE);
      return Uni.createFrom().item(config);
    } else {
      // resolve to default tenant config
      return Uni.createFrom().nullItem();
    }
  }
}

Expected behavior

All the fields in our config for our OIDC tenants are "deterministic" as in, they could easily be loaded on startup. So initializing them on startup would be our expected behavior

Actual behavior

The OIDC config does not get loaded on startup, resulting in intializing all OIDC classes etc on the first request, which makes the first request very slow.

How to Reproduce?

No response

Output of uname -a or ver

AWS ECS Fargate container

Output of java -version

openjdk 21.0.1 2023-10-17 LTS OpenJDK Runtime Environment Corretto-21.0.1.12.1 (build 21.0.1+12-LTS) OpenJDK 64-Bit Server VM Corretto-21.0.1.12.1 (build 21.0.1+12-LTS, mixed mode, sharing)

Quarkus version or git rev

3.11.0

Build tool (ie. output of mvnw --version or gradlew --version)

docker

Additional information

No response

quarkus-bot[bot] commented 1 week ago

/cc @pedroigor (oidc), @sberyozkin (oidc)

gsmet commented 1 week ago

Any chance you could get some profiling done?

We have some guidance here: https://github.com/quarkusio/quarkus/blob/main/TROUBLESHOOTING.md .

wesleysalimansdvb commented 1 week ago

I do not have the time right now to do any profiling, maybe these CloudWatch logs give you some insights? oidc-initalizing.xlsx

wesleysalimansdvb commented 1 week ago

My "main" question right now is (as a quick hack): How can i manually initialize the OIDC client so it is ready for my first request (via a healthcheck or something).

gsmet commented 1 week ago

So https://quarkus.io/guides/lifecycle#startup_annotation is what should be used to execute something at startup but I have no idea what you should initialize.

Question for @michalvavrik or @sberyozkin .

sberyozkin commented 1 week ago

@wesleysalimansdvb Hi, 4 secs is a very long period of time, to get JWKS fetched at runtime should take less than a second.

Why do you think that disabling the discovery postpones pulling the keys until the 1st request's time ? It only skips a step which involves fetching the discovery metadata doc pointing to the JWKS, etc endpoint, but JWKS, by default, will be fetched immediately: https://github.com/quarkusio/quarkus/blob/main/extensions/oidc/runtime/src/main/java/io/quarkus/oidc/runtime/OidcRecorder.java#L404

I wonder if quarkus.oidc.use-blocking-dns-lookup=true is a culprit here ? It is only required for slow systems to avoid a blocked IO thread problem when a connection to the provider is being established.

As a side note, you may want to experiment with https://quarkus.io/version/main/guides/security-openid-connect-multitenancy#issuer-based-tenant-resolver, to avoid having to create a custom resolver above which parses the token itself...

wesleysalimansdvb commented 1 week ago

quarkus.oidc.use-blocking-dns-lookup=true was added because we kept getting Vert.x blocked thread exceptions.

2024-06-24 07:13:24,827 WARN [io.ver.cor.imp.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 3160 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked

After enabling this property those exceptions are gone. 4 secs indeed is a long time, but that is what our logging says how long it takes to intialize all classes, algorithms and parse the token.

Thanks for the sidenote, this is something we will definitely try out!

sberyozkin commented 1 week ago

@wesleysalimansdvb So that stack trace says that it takes 3.16+ seconds to establish a connection with your provider. Sounds like this is what is happening, you get something close to 4 secs to have it established. You can try any of the quickstarts with live Keycloak and with the disabled discovery and I can guarantee you it won't take 4 seconds to handle a first request

sberyozkin commented 1 week ago

IMHO it is about your dev env's setup than anything else, a lot of users have the discovery disabled for endpoints, most of the OAuth2 social providers don't support it. Please prove it otherwise with the profiling if you'd like to have some progress on this issue

sberyozkin commented 1 week ago

@wesleysalimansdvb

Thanks for the sidenote, this is something we will definitely try out!

Sure, give it a try please and let Michal and myself know how it goes

sberyozkin commented 5 days ago

@wesleysalimansdvb Can you temporarily register a test client application and share the details with me offline (at Zulip, or post them to sbiarozk at redhat dot com with CC to someone else in your team) ? I can try it from my laptop to check what is going on there. I'd be worried even about 2 secs if I ever saw it happening in such cases, but I did not see it...

bartm-dvb commented 2 days ago

Hi @sberyozkin, I want to give you a headsup that @wesleysalimansdvb is out of office until next week. He will come back to you when he returns.

sberyozkin commented 2 days ago

Hi @bartm-dvb, thanks for the heads up, sure, lets continue next week or so and see what we can realistically do at the Quakus level