ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
32.62k stars 2.38k forks source link

Handling atomic fences under ThreadSanitizer #11650

Open kprotty opened 2 years ago

kprotty commented 2 years ago

So LLVM's ThreadSanitizer (TSan) is a tool for detecting race conditions. It detects races by rewriting memory operations to call their own functions for tracking purposes.

There's an issue in that TSan doesn't properly support atomic fences. To get around this, projects using fences have to either specialize around TSAN or move their algorithms to normal atomic operations (which can result in perf loss). In zig, we're currently doing the specialization approach but it begs to question whether this is the best way of going about it.

The issue extends past TSan for if (and when) Zig wants to provide its own tooling to detect race conditions. It may struggle a similar fate and that only means more specialization; which doesn't scale. I'm making this issue to discuss how to deal with this as, with stage2 and other backends approaching, it'll have to be dealt with eventually.

Extinguish

One option is to remove fences entirely. Fences are generally used to synchronize with atomic operations on other atomic variables without doing an operation themselves. There's an introspection problem here in that it doesn't communicate to the compiler exactly what atomic variables or operations its synchronizing with. If fence is removed, it means all synchronization relationships (at least through the language) for compiler tools become explicit.

Unfortunately, this probably wouldn't work. This would mean the language prevents you from generating optimal solutions compared to other languages like C, C++, and Rust (as fences are used for optimizaiton). Users would probably resort to using inline asm and basically specializing in reverse (if not TSan then asm fence). It's also mentioned that there could be algorithms in the wild which cannot be rewritten without fences (although I anecdotally haven't come across such).

Extend

Another option is to extend the @fence builtin to take an (optional) memory address to denote which atomic variable is synchronizes with. This solves the general issue where TSan barfs on fences, moves the specialization into the compiler, and keeps the fence builtin.

Downside is that we change (or remove in the case of extinguish) the fence's API which makes it differ from C11's. This difference would need to be documented but it's still something to document for people coming from other languages.

Embrace

Last option I can think of is to keep specializing under TSan for now. After all, it's the most common solution (used in Zig and Rust stdlib + other projects) and currently works with the most popular race detector. This is the base case and I wanted to reverse the order for the meme but it didn't flow right.


This topic came up from implementation concerns in std.atomic.Atomic where it was attempted to be removed. The driving reason was it containing code that should be in the compiler with secondary reasoning being that its multiple ways to do the same thing (stdlib vs builtin). Arguments against it generally fell into the "Atomic(T) highlights misuse bugs vs (T + builtin)" bucket and the "introduced useful operations (e.g. bitRmw) that aren't currently exposed by builtins" bucket.

Feel free to read up on comments in the PR linked above, but any more opinions are appreciated.

matu3ba commented 2 years ago

I dont think option "Extinguish" is viable, because the C11 memory model has several problems and is revisited see https://github.com/ziglang/zig/issues/6396#issuecomment-1108996193. A rewrite could not build on a "clear+futureproof design" and would rather be a TSAN port. Upstream LLVM TSAN (compiler-rt/lib/tsan), as of 094fb13b88b36ecfa475cb877d2c6e9d90b4d1a5, 20220511 has ~26024 LOC C++ code from which 2329 LOC are tests. (tokei -e tests compiler-rt/lib/tsan/ and tokei compiler-rt/lib/tsan/ installed with cargo install tokei)

For the same reason I suspect that extending things significantly might not be good as the research into models is very active (the linked paper in the comment describes it as very hard to follow due to activity). 1.Do you happen to know which paper is or are the reference ones that TSAN main developers are using as "intended semantics" or how they handle semantic problems?

  1. Could moving things into the compiler prevent use cases like user extensions similar to how Valgrind allows Kernel memory/object/state tracking? Or how do you estimate the risk here?
  2. How much runtime API surface for TSAN would be exposed to the user or will this only contain the compilation flag configurations?
kprotty commented 2 years ago

It's not immediately clear why the "Extinguish" option isn't viable and how the C11 memory ordering problems affect it. One can still have C11 semantics without fences - they can still exist in formal reasoning without being exposed or used at the implementation level. This does not imply a rewrite of TSan, nor is said rewrite (if it were to happen) need to built on a "clear+futureproof design" I don't think.

The "Extend" option shouldn't be significant: it only makes the synchronizes-with relationships for fences more explicit. It also shouldn't change any existing happens-before reasoning making it unrelated to continuing research on memory models. Anecdotally, fence synchronization hasn't really been "very hard to follow".

  1. No. I also don't think their implementation has to be backed by a paper to correctly implement C11 race detection.. If there are semantic mapping issues with TSan, those would be great to know though.
  2. I'm not sure what you refer to by "user extensions" w.r.t. moving TSan workarounds into the compiler
  3. None, just like at the moment. There could be std.tsan.* for utils like flushing TSan memory but those aren't currently used by the stdlib or compiler.
matu3ba commented 2 years ago

I'm not sure what you refer to by "user extensions" w.r.t. moving TSan workarounds into the compiler

https://valgrind.org/docs/manual/drd-manual.html section 8.2.5. Client Requests and https://valgrind.org/docs/manual/hg-manual.html shows a pile of macros, which TSan might choose to support in one or the other form.