open-telemetry / opentelemetry-android

OpenTelemetry Tooling for Android
Apache License 2.0
114 stars 22 forks source link

java.net.ProtocolException: Unexpected tag 0 #457

Open wirthandras opened 1 week ago

wirthandras commented 1 week ago

Hi,

We are using version 0.6.0. And we got an exception:

FATAL EXCEPTION: pool-4-thread-1
Process: com.demo, PID: 4692
java.lang.IllegalArgumentException: java.net.ProtocolException: Unexpected tag 0                                                                                                    at io.opentelemetry.contrib.disk.buffering.internal.serialization.deserializers.SpanDataDeserializer.deserialize(SpanDataDeserializer.java:28)
at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl.lambda$exportStoredBatch$0$io-opentelemetry-contrib-disk-buffering-internal-exporter-FromDiskExporterImpl(FromDiskExporterImpl.java:59)                                                                                                       at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl$$ExternalSyntheticLambda0.apply(D8$$SyntheticClass:0)
at io.opentelemetry.contrib.disk.buffering.internal.storage.files.ReadableFile.readAndProcess(ReadableFile.java:99)
at io.opentelemetry.contrib.disk.buffering.internal.storage.Storage.readAndProcess(Storage.java:86)
 at io.opentelemetry.contrib.disk.buffering.internal.storage.Storage.readAndProcess(Storage.java:69)
at io.opentelemetry.contrib.disk.buffering.internal.exporter.FromDiskExporterImpl.exportStoredBatch(FromDiskExporterImpl.java:55)
at io.opentelemetry.contrib.disk.buffering.SpanFromDiskExporter.exportStoredBatch(SpanFromDiskExporter.java:38)
at io.opentelemetry.android.features.diskbuffering.SignalFromDiskExporter.exportBatchOfSpans(SignalFromDiskExporter.kt:37)
at io.opentelemetry.android.features.diskbuffering.SignalFromDiskExporter.exportBatchOfEach(SignalFromDiskExporter.kt:86)
at io.opentelemetry.android.features.diskbuffering.scheduler.DefaultExportScheduler.onRun(DefaultExportScheduler.kt:34)
at io.opentelemetry.android.internal.services.periodicwork.PeriodicRunnable.run(PeriodicRunnable.kt:23)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:764)
Caused by: java.net.ProtocolException: Unexpected tag 0
at com.squareup.wire.ProtoReader.nextTag(ProtoReader.kt:172)
at io.opentelemetry.diskbuffering.proto.trace.v1.Span$ProtoAdapter_Span.decode(Span.java:1424)
at io.opentelemetry.diskbuffering.proto.trace.v1.Span$ProtoAdapter_Span.decode(Span.java:1324)
at io.opentelemetry.diskbuffering.proto.trace.v1.ScopeSpans$ProtoAdapter_ScopeSpans.decode(ScopeSpans.java:216)
at io.opentelemetry.diskbuffering.proto.trace.v1.ScopeSpans$ProtoAdapter_ScopeSpans.decode(ScopeSpans.java:174)
at io.opentelemetry.diskbuffering.proto.trace.v1.ResourceSpans$ProtoAdapter_ResourceSpans.decode(ResourceSpans.java:217)
at io.opentelemetry.diskbuffering.proto.trace.v1.ResourceSpans$ProtoAdapter_ResourceSpans.decode(ResourceSpans.java:175)
at io.opentelemetry.diskbuffering.proto.trace.v1.TracesData$ProtoAdapter_TracesData.decode(TracesData.java:155)
at io.opentelemetry.diskbuffering.proto.trace.v1.TracesData$ProtoAdapter_TracesData.decode(TracesData.java:124)
at com.squareup.wire.ProtoAdapter.decode(ProtoAdapter.kt:457)
at com.squareup.wire.ProtoAdapter.decode(ProtoAdapter.kt:455)
at io.opentelemetry.contrib.disk.buffering.internal.serialization.deserializers.SpanDataDeserializer.deserialize(SpanDataDeserializer.java:26)

The related Opentelemtry intializations:

import io.opentelemetry.android.OpenTelemetryRum
import io.opentelemetry.android.OpenTelemetryRumBuilder
import io.opentelemetry.android.config.OtelRumConfig
import io.opentelemetry.android.features.diskbuffering.DiskBufferingConfiguration
import io.opentelemetry.api.common.AttributeKey.stringKey
import io.opentelemetry.api.common.Attributes
import io.opentelemetry.api.metrics.Meter
import io.opentelemetry.api.trace.Tracer
import io.opentelemetry.exporter.otlp.http.logs.OtlpHttpLogRecordExporter
import io.opentelemetry.exporter.otlp.http.metrics.OtlpHttpMetricExporter
import io.opentelemetry.exporter.otlp.http.trace.OtlpHttpSpanExporter
import io.opentelemetry.sdk.logs.SdkLoggerProviderBuilder
import io.opentelemetry.sdk.logs.export.BatchLogRecordProcessor
import io.opentelemetry.sdk.metrics.SdkMeterProviderBuilder
import io.opentelemetry.sdk.metrics.export.PeriodicMetricReader
import io.opentelemetry.sdk.trace.SdkTracerProviderBuilder
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor

        val diskBufferingConfig =
            DiskBufferingConfiguration.builder()
                .setEnabled(true)
                .setMaxCacheSize(10_000_000)
                .build()
        val config =
            OtelRumConfig()
                .setDiskBufferingConfiguration(diskBufferingConfig)

        val spanExporter = OtlpHttpSpanExporter.builder()
            .setEndpoint(traceIngestUrl)
            .addHeader("Accept-Encoding", "identity")
            .build()

        val logRecordExporter = OtlpHttpLogRecordExporter.builder()
            .setEndpoint(logIngestUrl)
            .addHeader("Accept-Encoding", "identity")
            .build()

        val metricExporter = OtlpHttpMetricExporter.builder()
            .setEndpoint(metricsIngestUrl)
            .build()

        val loggerProviderCustomizer: BiFunction<SdkLoggerProviderBuilder, Application, SdkLoggerProviderBuilder> =
            BiFunction { sdkLoggerProviderBuilder, _ ->
                sdkLoggerProviderBuilder.addLogRecordProcessor(BatchLogRecordProcessor.builder(logRecordExporter).build())
            }

        val meterProviderCustomizer: BiFunction<SdkMeterProviderBuilder, Application, SdkMeterProviderBuilder> =
            BiFunction { sdkMeterProviderBuilder, _ ->
                sdkMeterProviderBuilder.registerMetricReader(
                    PeriodicMetricReader.builder(metricExporter).build()
                )
            }

        val otelRumBuilder: OpenTelemetryRumBuilder =
            OpenTelemetryRum.builder(this, config)
                .addSpanExporterCustomizer {
                    spanExporter
                }
                .addTracerProviderCustomizer { tracerProviderBuilder: SdkTracerProviderBuilder, app: Application? ->
                    val otlpExporter = OtlpHttpSpanExporter.builder()
                        .setEndpoint(traceIngestUrl)
                        .addHeader("Accept-Encoding", "identity")
                        .build()
                    val batchSpanProcessor = BatchSpanProcessor.builder(otlpExporter).build()
                    tracerProviderBuilder.addSpanProcessor(batchSpanProcessor)
                }
                .addLoggerProviderCustomizer(loggerProviderCustomizer)
                .addMeterProviderCustomizer(meterProviderCustomizer)

I located three files in cache folder so I attached them (I added manually the .txt extension, since the Github doesn't support the original format): 1719496845682.txt 1719496186206_b18ab147-65a4-4c1a-ab39-30ae9ed36001.tmp.txt 1719496186206.txt

The otel integration was worked without problem before, and the error occured in an android emulator.

What can I do? The App completly stopped.

(After cold boot of emulator device, the error has gone)

LikeTheSalad commented 5 days ago

Thank you for all the details. It seems like the file had a wrong data format somehow. Do you remember if your app was suddenly killed by the OS at some point before this issue happened? I'm asking because the only thing that comes to my mind would be that the writing process got abruptly interrupted.

What can I do? The App completly stopped.

(After cold boot of emulator device, the error has gone)

I think you've properly configured everything so this seems to be something we have to take care of in the diskbuffering lib, I'm currently working on a refactor so for the time being I'll leave this issue open to take a detailed look afterward. If you happen to see this issue again it'd help a lot if you could let us know just to get an idea of how often something like this can happen.

wirthandras commented 4 days ago

Hi, thanks for feedback!

I faced the issue only in emulator just only once, and the problem could be related to a really fast "Re-Run" of the application in Android Studio

esdrasdl commented 2 days ago

We are facing this issue as well. We do not set any config related to DiskBuffer. Is it feasible to disable disk buffering until the problem be fixed?

Captura de Tela 2024-07-04 às 13 59 23
LikeTheSalad commented 1 day ago

Thanks for letting us know @esdrasdl

Is it feasible to disable disk buffering until the problem be fixed?

Disk buffering is disabled by default so I'd recommend taking a look at the OTelRumConfig instance you're passing to the agent's initialization (OpenTelemetryRumBuilder) to make sure that you're not enabling it by mistake.