opentracing-contrib / java-spring-jaeger

Apache License 2.0
256 stars 95 forks source link

[req] Eliminate unknown-spring-boot fallback in jaeger.service-name #122

Open dekimsey opened 3 years ago

dekimsey commented 3 years ago

The current behavior In JaegerConfigurationProperties.java#L45 is to fallback to a dummy value if the name property is not set correctly.

Instead, I would like to see an error issued regarding the library being misconfigured instead. Or perhaps disabling any reporting spans. Or an assertion. Possibly anything other than a valid but dummy value that hides the application in tracing. If opentracing.jaeger.enabled is enabled, we should require a valid value to be set for the application name

In our infrastructure, we end up seeing spans with these unknown-spring-boot entries that developers are unaware are misconfigured. There is a disconnect between the developer wiring their application, and the actual issue being recorded in tracing. There is no (obvious to me) indication that an application was misconfigured and is emitting these garbage application names to tracing. This makes follow-up with the owners very difficult and usually involves a lot of really frustrating chasing.

Thank you!

Possibly related issues:

geoand commented 3 years ago

We could add a configuration property that disables this fallback. Would you like to contribute this?

On Sat, Mar 6, 2021, 00:27 Daniel Kimsey notifications@github.com wrote:

The current behavior In JaegerConfigurationProperties.java#L45 https://github.com/opentracing-contrib/java-spring-jaeger/blob/a83e391a7f78fdf3d88094b7ed0e4854cd0a68fa/opentracing-spring-jaeger-starter/src/main/java/io/opentracing/contrib/java/spring/jaeger/starter/JaegerConfigurationProperties.java#L45 is to fallback to a dummy value if the name property is not set correctly.

Instead, I would like to see an error issued regarding the library being misconfigured instead. Or perhaps disabling any reporting spans. Or an assertion. Possibly anything other than a valid but dummy value that hides the application in tracing. If opentracing.jaeger.enabled is enabled, we should require a valid value to be set for the application name

In our infrastructure, we end up seeing spans with these unknown-spring-boot entries that developers are unaware are misconfigured. There is a disconnect between the developer wiring their application, and the actual issue being recorded in tracing. There is no (obvious to me) indication that an application was misconfigured and is emitting these garbage application names to tracing. This makes follow-up with the owners very difficult and usually involves a lot of really frustrating chasing.

Thank you!

Possibly related issues:

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/opentracing-contrib/java-spring-jaeger/issues/122, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABBMDP2KQWLMHONOXQ46MGLTCFLDPANCNFSM4YV7TL3A .

dekimsey commented 3 years ago

Well, a configuration to disable the fallback leads us to the same issue. The out of the box defaults will produce bad traces. I believe the library should be more strict with it's default configuration. In essence, it should require a valid name be set if it's enabled.

Right now, I'm chasing down multiple applications that are generating unknown traces and it's not fun trying to find out where or who is generating them!

I could try to contribute it. But I'm not a Java developer. If the change is simple enough, I may be able to do it. And if not, I'll ask a favor of a coworker! But first, I think agreeing on a solution would be ideal.

geoand commented 3 years ago

I don't really want to change the default behavior as it had been around since the inception of the library and folks haven't complained.

Opting in to being more strict sounds like the best way to go to me. The change should be simple to make.

dekimsey commented 3 years ago

The issue we've found with an opt-in behavior is that simply put, we haven't fixed anything. If the client is misconfigured (never declared a service name), how is adding another flag to require the name flags be set going to address the root cause?

I've got hundreds of deployed instances from many different teams at different revisions. Tracking down these "unknown-spring-boot" instances is an operational nightmare and worse, not something I can fix. I then have to track down the application owners and point them to the docs and convince them to fix it. It's frustrating to say the least since it introduces bad data to the tracing system and isn't an error that's obvious to the developer.

Is there a precedence for this behavior (a dummy service name default?). I checked other languages' jaeger libraries and none of them appear to insert dummy values for the service name (my search was not exhaustive).

The go library requires a service name be set. The java library requires a service name be set. The python library requires a service name be set.

This leads me to believe it's this library is setting a bad default, and the behavior should be removed in favor of an appropriate fatal error.