open-telemetry / opentelemetry-java

OpenTelemetry Java SDK
https://opentelemetry.io
Apache License 2.0
1.97k stars 814 forks source link

ConcurrentModificationException in ConfigUtil #6732

Open neugartf opened 1 week ago

neugartf commented 1 week ago

Describe the bug There is a ConcurrentModificationException thrown while calling build() on LongGaugeBuilder.

Steps to reproduce not yet

What did you expect to see? No exception thrown

What did you see instead? ConcurrentModificationException

What version and what artifacts are you using? Artifacts: see *.gradle.kts excerpt below Version: 1.42.1 How did you reference these artifacts?

implementation(platform("io.opentelemetry:opentelemetry-bom:1.42.1"))
implementation("io.opentelemetry:opentelemetry-context")
implementation("io.opentelemetry:opentelemetry-api")
implementation("io.opentelemetry:opentelemetry-exporter-common")
implementation("io.opentelemetry:opentelemetry-exporter-otlp")
implementation("io.opentelemetry:opentelemetry-exporter-logging")
implementation("io.opentelemetry:opentelemetry-extension-kotlin")
implementation("io.opentelemetry:opentelemetry-api-incubator")
implementation("io.opentelemetry:opentelemetry-extension-kotlin")
implementation("io.opentelemetry:opentelemetry-sdk")

Environment Compiler: JDK 17 - Temurin OS: Ubuntu (unknown version) Runtime (if different from JDK above): Android runtime (ART) OS (if different from OS compiled on): Android

Additional context

  Caused by java.util.ConcurrentModificationException:
at java.util.Hashtable$Enumerator.next(Hashtable.java:1397)
at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:133)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:502)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:489)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:475)
at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:236)
at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:482)
at io.opentelemetry.api.internal.ConfigUtil.getString(ConfigUtil.java:40)
at io.opentelemetry.sdk.metrics.internal.debug.DebugConfig.<clinit>(DebugConfig.java:25)
at io.opentelemetry.sdk.metrics.internal.debug.SourceInfo.fromCurrentStack(SourceInfo.java:45)
at io.opentelemetry.sdk.metrics.internal.descriptor.InstrumentDescriptor.<init>(InstrumentDescriptor.java:25)
at io.opentelemetry.sdk.metrics.internal.descriptor.AutoValue_InstrumentDescriptor.<init>(AutoValue_InstrumentDescriptor.java:28)
at io.opentelemetry.sdk.metrics.internal.descriptor.InstrumentDescriptor.create(InstrumentDescriptor.java:35)
at io.opentelemetry.sdk.metrics.InstrumentBuilder.newDescriptor(InstrumentBuilder.java:111)
at io.opentelemetry.sdk.metrics.InstrumentBuilder.buildSynchronousInstrument(InstrumentBuilder.java:78)
at io.opentelemetry.sdk.metrics.SdkLongGauge$SdkLongGaugeBuilder.build(SdkLongGauge.java:96)
at io.opentelemetry.sdk.metrics.SdkLongGauge$SdkLongGaugeBuilder.build(SdkLongGauge.java:58) 
harshitrjpt commented 1 week ago

ConfigUtil.getString() uses System.getProperties() and Properties is not thread safe. Is it possible that your application in some other thread at the same time may be changing System.Properties in any way ?

I may be wrong, but to me it seems that making java.util.Properties thread safe is the only way.

jack-berg commented 1 week ago

at io.opentelemetry.sdk.metrics.internal.debug.DebugConfig.(DebugConfig.java:25)

This is unusual because its in the static initialization block of DebugConfig, which is only called once at startup.

We could do something like new HashMap<>(System.getProperties()) to make a copy of the system properties, but even that iterates over the the entries of the properties and may be susceptible (somebody fact check me on this) to concurrent modification exceptions.

We could also wrap the iteration with in a try / catch.

We could also find a way to iterate over the keys in a way that avoid concurrent modification exceptions. According to the HashTable javadoc:

The Enumerations returned by Hashtable's keys and elements methods are not fail-fast.

This suggests that something like System.getProperties().keys() could be used in iteration in the ConfigUtil#getString method. Would want to study / test more to be sure.

harshitrjpt commented 1 week ago

@jack-berg how about using ConcurrentHashMap to store a copy of the properties for iteration?

also AFAIK static block runs once when the class is loaded. I suppose DebugConfig.isMetricsDebugEnabled() in SourceInfo.fromCurrentStack() is the first time DebugConfig is referenced and hence loaded at that time. Maybe if we reference it earlier, but i don't think that would solve the issue.

neugartf commented 4 days ago

Is it possible that your application in some other thread at the same time may be changing System.Properties in any way ?

@harshitrjpt : Theoretically no, but honestly Android might be doing some things here we don't have really insight about.

neugartf commented 4 days ago

Would also vouch for a copy

System.getProperties().entrySet().stream().toArray()...
// or
Set.copyOf(properties.entrySet())

That should make iterating safe but ofc we wouldn't know about any changes to the list. Don't know if we necessarily need to.

jack-berg commented 4 days ago

That should make iterating safe but ofc we wouldn't know about any changes to the list. Don't know if we necessarily need to.

It seems like we have no choice but to accept the possibility of changes between when we read the keyset and read the values.