habanero-rice / hclib

A C/C++ task-based programming model for shared memory and distributed parallel computing.
http://habanero-rice.github.io/hclib/
BSD 3-Clause "New" or "Revised" License

Refinement of atomic operations in HClib runtime #76

Open sbak5 opened 5 years ago

sbak5 commented 5 years ago

HClib uses the `__sync` builtins through wrappers. GCC now treats these as legacy builtins, superseded by the `__atomic` builtins.

I think C11 is enforced for HClib, which means we can use C11 atomics (`<stdatomic.h>`) for atomic operations instead.
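As a rough sketch of what such a migration could look like, here is a `__sync`-style wrapper next to possible C11 equivalents. The function names (`counter_inc_sync`, `counter_inc_c11`, `flag_cas_c11`) are hypothetical and not taken from the HClib sources, and the chosen memory orders are only one plausible option. Only the `__sync_*` builtin and the `<stdatomic.h>` calls are real.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical legacy-style wrapper: the __sync builtin implies a full barrier. */
static inline int counter_inc_sync(volatile int *p) {
    return __sync_add_and_fetch(p, 1);   /* returns the NEW value */
}

/* Hypothetical C11 replacement with an explicit memory order. */
static inline int counter_inc_c11(atomic_int *p) {
    /* atomic_fetch_add returns the OLD value, so add 1 to match the wrapper above. */
    return atomic_fetch_add_explicit(p, 1, memory_order_acq_rel) + 1;
}

/* Hypothetical CAS wrapper: returns true if *p was changed from expected to desired. */
static inline bool flag_cas_c11(atomic_int *p, int expected, int desired) {
    return atomic_compare_exchange_strong_explicit(
        p, &expected, desired,
        memory_order_acq_rel,    /* ordering if the exchange succeeds */
        memory_order_acquire);   /* ordering if the exchange fails */
}
```

Note the difference in return values: `__sync_add_and_fetch` returns the updated value, while `atomic_fetch_add_explicit` returns the previous one, so a mechanical replacement has to account for that.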

Acquire/release ordering allows cheaper synchronization of memory reads and writes than the full barriers that the `__sync` builtins imply. I can easily find parts of the HClib runtime code that are not correctly synchronized with a 'release' memory barrier. This can cause hangs on weakly ordered machines such as ARM/Power. It might be fine on TSO machines such as x86, but I think it's better to use one consistent memory-ordering model across all of the runtime code, for compatibility across common architectures.
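To illustrate the release/acquire pairing in question, here is a minimal sketch of publishing plain data through an atomic flag. The `mailbox_t` type and its functions are hypothetical and are not an HClib data structure. In real use `mailbox_publish` and `mailbox_try_consume` would run on different threads.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-slot mailbox, for illustration only. */
typedef struct {
    int payload;         /* plain (non-atomic) data */
    atomic_bool ready;   /* flag that publishes the payload */
} mailbox_t;

static inline void mailbox_init(mailbox_t *m) {
    m->payload = 0;
    atomic_init(&m->ready, false);
}

/* Producer side: write the data first, then release-store the flag.
 * The release store keeps the payload write from being reordered after it. */
static inline void mailbox_publish(mailbox_t *m, int value) {
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer side: acquire-load the flag. If it reads true, the payload
 * written before the matching release store is guaranteed to be visible. */
static inline bool mailbox_try_consume(mailbox_t *m, int *out) {
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}
```

If the flag were written with a plain or relaxed store, the C11 memory model would no longer guarantee that a consumer seeing `ready == true` also sees the new payload. That kind of bug tends to stay hidden on x86 and show up on ARM/Power.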