Stackdriver / stackdriver-prometheus-sidecar

A sidecar for the Prometheus server that can send metrics to Stackdriver.
https://cloud.google.com/monitoring/kubernetes-engine/prometheus
Apache License 2.0

Flaky test: TestTailFuzz #67

Closed jkohen closed 3 weeks ago

jkohen commented 5 years ago

This test failed during presubmit, and it's unrelated to my PR (which was documentation only): https://travis-ci.com/Stackdriver/stackdriver-prometheus-sidecar/builds/94064905

@fabxc how can we investigate this?

fabxc commented 5 years ago

The tailer keeps retrying to read more data until its context gets canceled. Currently, after writing the last record, the test waits a fixed amount of time and then cancels the context. My guess is that the tailer just hasn't read everything yet when the writing goroutine terminates it.

I tried to reproduce this by running the test at high parallelism to cause contention, but without success. Possibly the process gets completely stalled for longer on Travis. I'd probably just raise the timeout for now. While that's not a 'proper' fix, anything that coordinates explicitly between reader and writer would seem to defeat the fuzz part of the test a bit.

jkohen commented 5 years ago

Reopening. I fixed one issue, but while working on that fix I noticed another flake: panic: write /tmp/test_tail370907351/00000059: file already closed. I confirmed that this issue existed before PR #138.

qingling128 commented 3 weeks ago

Closing this issue as the component has been deprecated.