Steps To Reproduce
go test -timeout 30s -run ^TestMeterConcurrentSafe$ go.opentelemetry.io/otel/internal/global -count 1000
Expected behavior
No Panic
Proposal
As a quick fix, stop the panic by clearing the map instead of releasing it. #5758
The panic also shows that SetMeterProvider can miss setting the delegate for instruments and the registry. This needs fixing as well. A simple solution could be to hold a mutex for the whole of instrument creation. #5780
The concurrent-safety tests did not serve their purpose well because they ran too few times (only once in CI). We can run them 100 times. #5759
Description
This bug was found in CI after running the TestMeterConcurrentSafe test and was introduced in #5754. It occurs when instruments are created while the setDelegate method is being called. This is because instrument creation can be blocked on the lock at this line:
https://github.com/open-telemetry/opentelemetry-go/blob/e47618fc36af51d17ecdcc7299bbf706397e1cb1/internal/global/meter.go#L350
While setDelegate is cleaning up the instruments map:
https://github.com/open-telemetry/opentelemetry-go/blob/e47618fc36af51d17ecdcc7299bbf706397e1cb1/internal/global/meter.go#L144
Then, once the mutex is unlocked, the remaining code in the instrument creation method triggers the panic:
https://github.com/open-telemetry/opentelemetry-go/blob/e47618fc36af51d17ecdcc7299bbf706397e1cb1/internal/global/meter.go#L360
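The sequence above can be reproduced in miniature. The sketch below is not the code in meter.go: the meter type and the release, reset, and create methods are hypothetical, and it assumes that "releasing" the map means setting it to nil. Under that assumption, a creation that acquires the lock after the map has been released writes to a nil map and panics, while a map emptied with the clear builtin (Go 1.21+) stays writable.

```go
package main

import (
	"fmt"
	"sync"
)

// meter is a hypothetical stand-in for the global meter, not the real type.
type meter struct {
	mu          sync.Mutex
	instruments map[string]int
}

// release drops the map entirely (assumed meaning of "releasing the map").
func (m *meter) release() {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.instruments = nil
}

// reset empties the map but keeps it allocated.
func (m *meter) reset() {
	m.mu.Lock()
	defer m.mu.Unlock()
	clear(m.instruments)
}

// create records an instrument and reports whether the map write panicked.
func (m *meter) create(name string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	m.mu.Lock()
	defer m.mu.Unlock()
	m.instruments[name] = 1 // panics if the map is nil
	return false
}

func main() {
	a := &meter{instruments: map[string]int{}}
	a.release()
	fmt.Println("after release, create panicked:", a.create("counter"))

	b := &meter{instruments: map[string]int{}}
	b.reset()
	fmt.Println("after reset, create panicked:", b.create("counter"))
}
```

The first call reports a panic and the second does not, which is why clearing the map works as a quick fix even though it leaves the missed-delegation problem open.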