Closed kellabyte closed 5 years ago
It's worse than that: there are similar assumptions in other locations in this code base. Running go test -bench=. -race reveals data races in cursor, reader, shared_disruptor, and shared_writer.
I started trying to fix them all, but it turned out to be a bit of a rabbit hole. I'm concerned that this pattern may lose most of its advantages, because the only guarantee the Go compiler makes is the happens-before relationship within a goroutine.
I may try a smaller proof of concept which involves only the spsc case and see if I can recreate the memory barrier without resorting to CAS operations and satisfy the race detector.
What's the word on this?
We are not actively pursuing this project. We welcome pull requests.
@joliver Noted, thanks :)
Nevertheless, @dallbee, I'm unable to replicate the race condition you mention. Has this been fixed in the meantime or am I just not getting "lucky" with Go's race detector?
@lthibault I can no longer reproduce the behavior on Go 1.11 against Linux 4.19.8-arch1-1-ARCH x86_64.
@kellabyte @dallbee I am interested in resurrecting this project and I'm comfortable with the pattern. The pain point that I'm running into is the "happens-before" guarantees between goroutines without using assembly-style memory fences. I'd love to figure out how to better work with lfence and rfence so that I don't create weird concurrency bugs.
@kellabyte I went back to basics for a bit and stripped out all the excess. I believe I've got it working with a single writer and sets of consumer groups.
A consumer group is a set of consumers that can work independently of each other at the same time.
When there are multiple groups of consumers, each set gates on the previous set.
By completely removing the SharedWriter, I believe I have resolved all of the race conditions in the code. I'm still looking at ways to implement the SharedWriter in a way that's compatible with the Go memory model and without me having to write CPU-specific assembly code.
For now, I'm going to consider this issue complete. Again, I'm more than happy to have a discussion about how to implement various structures to optimize the implementation further and to work out any kinks.
The cursor implementation in cursor_amd64.go does not contain any of the atomic operations that you see in cursor_386.go or cursor_ARM.go. It's relying on atomic instruction guarantees that AMD64 CPUs provide. The problem is that the Go compiler can apply optimizations that could make relying on this unsafe.
With optimizations we have no guarantee that the value will be stored when we call Cursor.Store(), or even that Cursor.Read() is always reading the latest value. Cursor.Read() may load it once into a register and never load from memory again. The code might work now, but future improvements to the gc may cause this code to break.
The atomic Store/Load operations guarantee the memory write / read operations are linearizable, allowing the value to be observed by concurrent goroutines.
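As a rough sketch of what an atomic-based cursor could look like, assuming nothing about the project's actual layout: the Cursor type below, its padding, and the Load method name are illustrative assumptions, built only on the standard sync/atomic package. The reader spins on Load, which the compiler may not hoist into a single register read, and the writer's Store is guaranteed to become visible to it.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// Cursor is a hypothetical stand-in for the library's cursor type.
// The padding is an assumption: it keeps the sequence on its own
// 64-byte cache line to avoid false sharing.
type Cursor struct {
	sequence int64
	_        [56]byte
}

// Store publishes v so that concurrent goroutines can observe it.
func (c *Cursor) Store(v int64) { atomic.StoreInt64(&c.sequence, v) }

// Load always reads the current value from memory.
func (c *Cursor) Load() int64 { return atomic.LoadInt64(&c.sequence) }

func main() {
	var c Cursor
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		// Spin until the writer has published sequence 1000.
		for c.Load() < 1000 {
			runtime.Gosched()
		}
	}()

	for i := int64(1); i <= 1000; i++ {
		c.Store(i)
	}
	wg.Wait()
	fmt.Println(c.Load()) // prints 1000
}
```

Running a program like this with the -race flag should report no data race on the sequence field, since every access goes through sync/atomic.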
Proposal
I think cursor_AMD64.go should include atomic operations. If I make cursor_AMD64.go the same as the others, I see what I consider an acceptable performance change for something that we can rely on being safe. I think atomic operations should be used for safety purposes.
I would be happy to provide a PR with the changes.
Benchmarks
Benchmarks were run 10 times and compared with benchstat.