helidon-io / helidon

Java libraries for writing microservices
https://helidon.io
Apache License 2.0
3.44k stars 562 forks source link

WebClientSecurity::handle throws NPE when handling a untraced request (SpanContext for parent is null) #8800

Closed RickyFrost closed 1 month ago

RickyFrost commented 1 month ago

Problem Description

Our SE service receives a request that is configured as untraced, so no Span will be current during the request. For example:

tracing:
  enabled: true
  service: "name-here"
  paths:
    - path: /untracedPath
      enabled: false

And during the request, our code uses a JerseyClient with connector to helidon WebClient to make a request to a dependent service. It just so happens to be in the authorize code path of the security provider to get an authorize decision, but I believe it would happen any other time as well. And the resulting NPE stack is thrown. See attachment for actual stack.

Steps to reproduce

Run any web client request on code that is in the tracing config as enabled=false, and make a web client request.

untracedPath-using-WebClient-gets-NPE-spanContext.txt

RickyFrost commented 1 month ago

https://github.com/helidon-io/helidon/blob/c06dc1fb0d99f6822658f08c126b9e0a6b139b84/webclient/security/src/main/java/io/helidon/webclient/security/WebClientSecurity.java#L116

tjquinno commented 1 month ago

It's also possible that, if all of tracing is disabled, the Tracer might not be in the context either so there could be another source of an NPE at line 114.

The two uses of the span variable later in the method need to be handled slightly differently as well because those uses and/or downstream code will not tolerate a null span.

RickyFrost commented 1 month ago

We were likely dodging this bullet until the following fix went in... https://github.com/helidon-io/helidon/issues/8573

RickyFrost commented 1 month ago

Tim, what you say above may be true in general. But in our case we always run with tracing enabled, it's just some paths are disabled, like all the K8s probes (health, metrics, etc). So for us, as long as the Helidon code properly handles that case it will be sufficient for us. But for the product as a whole you have that consideration when making the fix. The point of this bug is, the disabling of paths is a tracing feature, and so must be fully supported for all possible paths through the Helidon code.

tjquinno commented 1 month ago

Ricky, I was not offering a competing explanation for what you are seeing. Just advice to whoever will work on the fix.

RickyFrost commented 1 month ago

Absolutely understood, just adding more color commentary :-)