relab / hotstuff


eventloop: TestHandler is flaky; possible deadlock #120

Status: Closed (meling closed this issue 4 months ago)

meling commented 4 months ago
$ go test -run TestHandler -count=20000
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
--- FAIL: TestHandler (1.00s)
    eventloop_test.go:31: timed out
FAIL
exit status 1
FAIL    github.com/relab/hotstuff/eventloop 4.776s

Sometimes the test passes just fine (and finishes quickly), but if you run it many times you should be able to reproduce the failure. Increasing the context's timeout does not help, so I suspect this is a deadlock: non-failing test executions finish fast.

I discovered this when upgrading the dependencies in PR #118, which itself doesn't touch the event loop. However, it also reproduces on master at commit 6c1fcb7b4b413.

meling commented 4 months ago

Adding the following line:

    go el.Run(ctx)
+   time.Sleep(1 * time.Millisecond) // wait for the event loop to start

seems to resolve the problem:

$ go test -run TestHandler -count=200000
PASS
ok      github.com/relab/hotstuff/eventloop 237.785s