parallel-runtimes / lomp

Little OpenMP Library
Apache License 2.0

Initial fix for issue 6 (improved nonmonotonic scheduling). #58

Closed JimCownie closed 2 years ago

JimCownie commented 2 years ago

Improve the scheduling code for dealing with the "static steal" scheduling implementation (used for nonmonotonic:dynamic schedules). This change fixes two problems:

  1. The existing code failed to compile with g++ (it doesn't like a std::atomic data member inside an unnamed struct/union).
  2. The incrementBase function now uses a std::memory_order_seq_cst store when updating the base. This is needed to prevent the load of the end from being reordered ahead of the store, which would make it impossible to detect the race that the load is there to detect.
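
As a rough illustration of both points (not the actual LOMP source), the sketch below keeps the two atomics in a named struct and has the owner publish its new base with a sequentially consistent store before reading the end. The names `WorkRange` and `claimNext`, the 32-bit counters, and the undo-on-conflict policy are assumptions made for this example. The load of the end is also written as seq_cst here, because the C++ memory model only orders a store and a later load of a different location when both are seq_cst (or a fence separates them).

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical layout: a *named* struct holding the two atomics, so no
// std::atomic member lives inside an unnamed struct/union.
struct WorkRange {
  std::atomic<std::uint32_t> base{0}; // next iteration the owner will run
  std::atomic<std::uint32_t> end{0};  // one past the last unstolen iteration
};

// Owner side (illustrative): try to claim the iteration at the current base.
// Returns true and sets `claimed` if the claim cannot overlap a steal.
bool claimNext(WorkRange &r, std::uint32_t &claimed) {
  // Only the owner writes base, so a relaxed read of our own value is enough.
  std::uint32_t oldBase = r.base.load(std::memory_order_relaxed);

  // Publish the new base. seq_cst keeps the following load of end from being
  // reordered ahead of this store.
  r.base.store(oldBase + 1, std::memory_order_seq_cst);

  // Now look at end; a thief may have lowered it concurrently.
  std::uint32_t curEnd = r.end.load(std::memory_order_seq_cst);
  if (oldBase < curEnd) {
    claimed = oldBase;
    return true;
  }

  // Either the range is empty or a thief took the iteration: undo the claim.
  // A real implementation would resolve the conflict more carefully.
  r.base.store(oldBase, std::memory_order_relaxed);
  return false;
}
```

With `end` set to 2, two successive calls to `claimNext` hand out iterations 0 and 1, and a third call fails and leaves `base` at 2.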

With this fix the code passes testing on X86_64 (where it previously failed), and one can see that the change to the requested memory order has affected the generated code (introducing the use of an xchg instruction [which is always atomic on X86] instead of a simple store). On AARCH64 there is no change, since it already used ldar and stlr instructions, which synchronize with each other.