m-lab / ndt-server

docker native ndt5 and ndt7 server with prometheus integration
https://www.measurementlab.net/
Apache License 2.0
101 stars 40 forks source link

Test_DrainForeverButMeasureFor_NormalOperation failure #255

Open bassosimone opened 4 years ago

bassosimone commented 4 years ago
2020/02/11 21:04:42 c2s.go:165: Timed out
--- FAIL: Test_DrainForeverButMeasureFor_NormalOperation (0.11s)
    c2s_test.go:56: Expected positive byte count but got 0

See https://travis-ci.org/m-lab/ndt-server/builds/649118098#L3475

stephen-soltesz commented 4 years ago

Yeah, this happened to me yesterday too.

2020/02/24 22:29:52 c2s.go:165: Timed out
--- FAIL: Test_DrainForeverButMeasureFor_NormalOperation (0.14s)
    c2s_test.go:56: Expected positive byte count but got 0

FAIL
pboothe commented 4 years ago

Flaky test. :(

stephen-soltesz commented 4 years ago

This appears to be caused by a genuine race. The TCPInfo data returned contains non-zero metrics for other fields, but the test only checks BytesReceived, which happens to be zero after the provided timeouts expire (some of the time). This is why err is nil in these cases.

The web100 "measurer" polls every 100ms https://github.com/m-lab/ndt-server/blob/master/ndt5/web100/web100_linux.go#L55

The c2s "drain forever" unit tests variously provide timeouts of 100ms, 1sec, 1ms.

This bug is reporting the "NormalOperation" variant which uses 100ms. We may see this more often merely b/c it runs first.

Anecdotally, I've also seen an error from the "WsGoodput" variant which uses 1ms while working to migrate to the netx package.

stephen-soltesz commented 4 years ago

PR #297 includes improvements to the unit tests by adjusting some of these timeouts.

Originally, I could trigger the test error simply by running this loop for a few minutes:

while go test -v -count=5 -timeout 30s github.com/m-lab/ndt-server/ndt5/c2s ; do sleep .1 ; done

After adjusting the timeouts, it ran over night (~10hrs) without error. While this does not fundamentally fix the race, I think it improves the situation.