parallel-runtimes / lomp

Little OpenMP Library
Apache License 2.0
153 stars 17 forks source link

Atomic operations: implementation may need more memory fences #40

Open JimCownie opened 2 years ago

JimCownie commented 2 years ago

Describe the bug OpenMP 5.1 has added a memory_orderclause to the atomic construct. That may require additional load/store fencing that our implementation does not currently perform. In particular, our implementation of max/min atomics will not provide any memory fencing in the case where the current thread's contribution is the wrong side of the target of the atomic (since, in that case, it does not need to write to the atomic target). However, it is not yet clear how compilers will pass the information from the extra consistency operation into the runtime, and the default is that the memory order semantics are "relaxed", so this may not be a huge problem.

Looking at compiler explorer it seems that the LLVM is calling __kmpc_flush after the atomic operation in the seq_cst case, so as long as we implement that correctly, we may be fine, though (at least clang) doesn't yet seem to support the compare clause, which may change things.

In any case, we should check...

To Reproduce Please provide steps to reproduce the behavior:

  1. Read the section of the standard at https://www.openmp.org/spec-html/5.1/openmpsu105.html#x138-1480002.19.7
  2. Worry

Expected behavior The runtime should implement the requested behaviour, once the compiler passes the information through to us.