golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.02k stars 17.54k forks source link

x/net/trace: data race when freeing traces #39790

Open jrockway opened 4 years ago

jrockway commented 4 years ago

This bug is on go version go1.14.4 linux/amd64 with x/net from master (techincally 627f9648deb96c27737b83199d44bb5c1010cbcf).

There is a race condition between event recycling and trace clearing when the trace has a recycler set with SetRecycler and the number of events in tr.events is <= 4 (len(tr.eventBuf)). When a trace is initialized, tr.events is set to tr.eventsBuf[:0]. In unref, the freeing logic starts a recycling function over each event in a separate goroutine by retaining a reference to the tr.events slice, and then calls freeTrace and then reset to write a zero value to each element of the eventsBuf array. When the number of events added to the trace fits in the space reserved by tr.eventsBuf, tr.events just aliases that buffer, and so the order of reads and writes is non-deterministic. The freeing could run first, and thus the recycler function sees the zero-valued events written by reset, or the recycling goroutine could run first, in which case the recycling performs as expected.

https://github.com/golang/net/pull/75 (https://go-review.googlesource.com/c/net/+/238302) adds a test and a fix for this. If you patch in only the test and run "go test -race ." in the trace directory, you'll see that the race detector detects this data race. My fix makes a copy of tr.events if the number of events is less than or equal to the length of tr.eventsBuf, so that the recycler can run at its leisure. Obviously this is not ideal, as the whole point of eventsBuf and the trace pool is to avoid allocations. However, I get the feeling that people don't actually use event recyclers, or they would have already noticed that events don't get recycled and filed a bug like this ;)

Other options that could be considered are doing the entire free asynchronously (letting the refcount hit 0, doing recycling and freeing in the background, and then pushing the trace into the pool that newTrace draws from), or just doing the recycling synchronously (which probably breaks existing code that did something silly like tr.SetRecycler(func (e interface{}) { ch <- e }); tr.Finish(); <-ch). Let me know which approach you prefer, and I'll update the PR accordingly.

ianlancetaylor commented 4 years ago

CC @pjweinb