open-telemetry / opentelemetry-android

OpenTelemetry Tooling for Android
Apache License 2.0

Issue with Android Instrumentation for OkHttp #433

Closed esdrasdl closed 3 weeks ago

esdrasdl commented 3 weeks ago

Hello. I'm having some issues using OpenTelemetry in my Android app. Because I want to track every request as a span, I followed the instructions here.

I took a look at the code and figured out that I need to set my OpenTelemetry instance on GlobalOpenTelemetry to make it work. Let me know if I misunderstood.

The first time I load the app, I usually have no problem. But if I kill the app and open it again, the SDK throws an exception.

Here is the exception.

java.lang.IllegalStateException: GlobalOpenTelemetry.set has already been called. GlobalOpenTelemetry.set must be called only once before any calls to GlobalOpenTelemetry.get. If you are using the OpenTelemetrySdk, use OpenTelemetrySdkBuilder.buildAndRegisterGlobal instead. Previous invocation set to cause of this exception.
14:12:35.926  W     at io.opentelemetry.api.GlobalOpenTelemetry.set(GlobalOpenTelemetry.java:107)
14:12:35.927  W     at io.opentelemetry.api.GlobalOpenTelemetry.set(GlobalOpenTelemetry.java:115)
14:12:35.927  W     at io.opentelemetry.api.GlobalOpenTelemetry.get(GlobalOpenTelemetry.java:85)
14:12:35.927  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons.lambda$static$2(OkHttp3Singletons.java:43)
14:12:35.927  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons$$ExternalSyntheticLambda2.get(D8$$SyntheticClass:0)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.CachedSupplier.get(CachedSupplier.java:31)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons.lambda$static$5(OkHttp3Singletons.java:86)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons$$ExternalSyntheticLambda5.get(D8$$SyntheticClass:0)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.CachedSupplier.get(CachedSupplier.java:31)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.LazyInterceptor.intercept(LazyInterceptor.java:27)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons.lambda$static$4(OkHttp3Singletons.java:79)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons$$ExternalSyntheticLambda4.intercept(D8$$SyntheticClass:0)
14:12:35.928  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons.lambda$static$3(OkHttp3Singletons.java:72)
14:12:35.929  W     at io.opentelemetry.instrumentation.library.okhttp.v3_0.internal.OkHttp3Singletons$$ExternalSyntheticLambda3.intercept(D8$$SyntheticClass:0)

My code

fun init(application: Application, deviceInfo: String) {
    val config = createOtelRumConfig()
    val otlpExporter = createOtlpHttpSpanExporter(header = deviceInfo)
    val filteringSpanExporter = createFilterExporter(otlpExporter)

    val openTelemetryRumBuilder = OpenTelemetryRum.builder(application, config).apply {
        mergeResource(createResource())
        addTracerProviderCustomizer { tracerProviderBuilder: SdkTracerProviderBuilder, app: Application? ->
            tracerProviderBuilder.apply {
                val batchSpanProcessor = BatchSpanProcessor.builder(filteringSpanExporter)
                    .build()
                addSpanProcessor(batchSpanProcessor)
            }
        }
    }

    try {
        openTelemetryRumBuilder.build().apply {
            GlobalOpenTelemetry.set(openTelemetry)
        }
    } catch (e: Exception) {
        e.printStackTrace()
    }
}

It looks like GlobalOpenTelemetry.get() inside OkHttp3Singletons is called before I set up my OpenTelemetry instance in the Application class. This may happen because I initialize Koin (a DI framework) before starting the OpenTelemetry setup. I use Koin to create and provide the OkHttp client instances, and I set up OpenTelemetry by hand in my Application class.

GlobalOpenTelemetry.get() sets OpenTelemetry.noop(), and when I then try to set the right instance it throws an exception. After that I can't set up OkHttp3Singletons again, and I'm not able to retrieve the spans...
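
The failure mode described above can be reproduced with a small stand-alone model. This is a simplified sketch, not the actual GlobalOpenTelemetry source: GlobalHolder, Telemetry, NoopTelemetry and ConfiguredTelemetry are hypothetical names invented for illustration. It assumes only what the exception message and stack trace show, namely that get() falls back to set() with a no-op instance and that a second set() throws.

```kotlin
// Hypothetical stand-ins for the real OpenTelemetry types.
interface Telemetry { val name: String }

object NoopTelemetry : Telemetry { override val name = "noop" }

class ConfiguredTelemetry(override val name: String) : Telemetry

// Simplified model of a set-once global registry.
object GlobalHolder {
    private var instance: Telemetry? = null
    private var firstSetter: Throwable? = null

    @Synchronized
    fun set(telemetry: Telemetry) {
        if (instance != null) {
            // The earlier caller is attached as the cause, which is why the
            // stack trace in the report points at the interceptor, not at init().
            throw IllegalStateException("set has already been called", firstSetter)
        }
        instance = telemetry
        firstSetter = Throwable("previous invocation")
    }

    @Synchronized
    fun get(): Telemetry {
        if (instance == null) {
            // Nobody registered an instance yet: fall back to a no-op and keep it.
            set(NoopTelemetry)
        }
        return instance!!
    }
}

fun main() {
    // An early HTTP request reaches the interceptor before the app's init() runs.
    val early = GlobalHolder.get()
    println(early.name) // prints "noop"

    // The app's own setup now arrives too late.
    try {
        GlobalHolder.set(ConfiguredTelemetry("rum"))
    } catch (e: IllegalStateException) {
        println("late set failed: ${e.message}")
    }

    // Every later lookup still sees the no-op instance, so no spans are exported.
    println(GlobalHolder.get().name) // prints "noop"
}
```

In this model, the only way to end up with the configured instance is for set() to run before the first get(), which matches the ordering problem described in this report.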

What can I do to handle this?

LikeTheSalad commented 3 weeks ago

Hi! OkHttp3Singletons takes care of calling GlobalOpenTelemetry.get() as you mentioned, although it does so lazily the first time the OkHttp interceptor is run. So it seems like somewhere in your application there's an HTTP request happening before the OTel RUM instance has finished initializing.

It should be OK to create your own OkHttp instance beforehand. The problem arises if that instance (or any OkHttp instance, for that matter) is used to make requests before OTel has finished initializing. So one thing we could verify is where your app makes its first HTTP request (using OkHttp), and then try to initialize OTel before that.

Another possible cause is that GlobalOpenTelemetry.get() is called elsewhere, perhaps not even in relation to OkHttp's interceptors. That makes this singleton tricky to rely on, because the call might come from code that doesn't belong to your app (for example, a dependency calling it at some point). Because of this, I think we should ideally avoid using GlobalOpenTelemetry altogether. We're currently working on an instrumentation API that will help propagate the OpenTelemetry RUM instance to all instrumentations without relying on GlobalOpenTelemetry. I'm not sure how feasible that will be for this particular OkHttp instrumentation, but I'm planning to take a look at it. For now, all I can recommend is making sure that no HTTP requests are triggered before the OTel RUM instance is ready.

esdrasdl commented 3 weeks ago

I saw that another way to instrument OkHttp clients is to use OkHttpTelemetry, but it requires changing the way we get the endpoints through the Retrofit builder, and I would also need to make a lot of changes on the app side...

I will follow your recommendation and keep an eye out for new ways to instrument OkHttp clients.

esdrasdl commented 3 weeks ago

Just an update. I tried to prevent any HTTP request from being made before the OTel RUM instance is ready, but it didn't work. I had noticed earlier that there were some spans related to third-party libraries that may use OkHttp... Maybe those libraries are interfering. Is it possible to automatically instrument only my own OkHttp clients?

LikeTheSalad commented 3 weeks ago

I see. It seems you might have some dependencies making HTTP requests before your app.onCreate runs, which could definitely cause trouble, because the code that adds the OTel interceptors is injected into the OkHttpClient builder itself and therefore affects all OkHttpClient instances.

For now, it is not possible to make the automatic instrumentation cover only one OkHttp instance, and I'm not sure such a feature would be a good idea if you want visibility into all the HTTP traffic your app handles. You can still apply the instrumentation manually to your own OkHttp instance while turning off the automatic instrumentation. As you mentioned, that does affect how other parts of your app are set up, but right now it is the only option. In the meantime, I'm planning to find a way to keep the automatic instrumentation from using GlobalOpenTelemetry, to avoid these kinds of issues in the future.
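
A minimal sketch of the manual route mentioned above, assuming the OkHttpTelemetry class from the opentelemetry-java-instrumentation OkHttp library (io.opentelemetry.instrumentation.okhttp.v3_0) and Retrofit's callFactory hook. tracedCallFactory and buildRetrofit are hypothetical helper names, and the openTelemetry argument is assumed to be the instance exposed by the built OpenTelemetryRum. Turning off the automatic instrumentation is a separate build-configuration step that is not shown here.

```kotlin
import io.opentelemetry.api.OpenTelemetry
import io.opentelemetry.instrumentation.okhttp.v3_0.OkHttpTelemetry
import okhttp3.Call
import okhttp3.OkHttpClient
import retrofit2.Retrofit

// Hypothetical helper: wraps one specific client so that only calls made
// through the returned factory produce spans. Other OkHttpClient instances
// (for example those created by third-party libraries) are left untouched.
fun tracedCallFactory(openTelemetry: OpenTelemetry, client: OkHttpClient): Call.Factory =
    OkHttpTelemetry.builder(openTelemetry)
        .build()
        .newCallFactory(client)

// Hypothetical helper: Retrofit accepts any Call.Factory, so the traced
// factory is passed in place of the plain client. baseUrl is supplied by the caller.
fun buildRetrofit(openTelemetry: OpenTelemetry, client: OkHttpClient, baseUrl: String): Retrofit =
    Retrofit.Builder()
        .baseUrl(baseUrl)
        .callFactory(tracedCallFactory(openTelemetry, client))
        .build()
```

Because the OpenTelemetry instance is passed in explicitly, this wiring does not depend on GlobalOpenTelemetry at all, so the order in which other code calls GlobalOpenTelemetry.get() should no longer matter for these clients.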