open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[test][exporter/prometheus] TestPrometheusExporter failures #36139

Open pjanotti opened 2 hours ago

pjanotti commented 2 hours ago

Component(s)

exporter/prometheus

Describe the issue you're reporting

A few instances of failures:

=== FAIL: . TestPrometheusExporter (0.00s)
    prometheus_test.go:87: 
            Error Trace:    D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/prometheusexporter/prometheus_test.go:87
            Error:          Received unexpected error:
                            close tcp 127.0.0.1:8999: use of closed network connection
            Test:           TestPrometheusExporter

=== FAIL: . TestPrometheusExporter (re-run 1) (0.01s)
    prometheus_test.go:87: 
            Error Trace:    D:/a/opentelemetry-collector-contrib/opentelemetry-collector-contrib/exporter/prometheusexporter/prometheus_test.go:87
            Error:          Received unexpected error:
                            close tcp 127.0.0.1:8999: use of closed network connection
            Test:           TestPrometheusExporter

DONE 2 runs, 73 tests, 2 failures in 361.711s

At a quick glance, the likely cause is that the server using the port is launched in a goroutine, so shutdown can close the port before the server has started.

dashpole commented 2 hours ago

@Argannor this seems likely to be related to https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/35465.

I'm OOO for the next few weeks. Are you able to investigate + fix or alternatively revert your PR?

Thanks!

Argannor commented 2 hours ago

@dashpole I can take a closer look on Monday and come up with a new PR. If that's not fast enough for you, feel free to revert, and I'll open a new PR reintroducing the changes plus the fix.

> In quick glance this seems likely because the server using the port is actually launched in a goroutine and the port can be closed by shutdown before the server starts.

I think that's a likely explanation. I'll do some experimentation to confirm your hypothesis. If I'm not mistaken, the error only occurs sporadically, so it's likely to be a race condition.

My apologies.

dashpole commented 1 hour ago

Monday should be fine. I won't be able to review for a while, but other approvers are free to do so.