Open anyzelman opened 9 months ago
Prioritising higher as the use in nonblocking/coordinates.hpp seems to indicate false sharing is possible
More precisely, the false sharing would be on local_prefix_sum in the function prefixSumComputation().
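For illustration, here is a minimal sketch of how per-thread partial sums can be padded so that no two threads write to the same cache line. It is not the code in nonblocking/coordinates.hpp: `PaddedSum`, `chunkedPrefixSum`, and the 64-byte `CACHE_LINE` are hypothetical names and assumptions made for this example. The pragmas take effect only when compiled with OpenMP enabled; otherwise the loops run sequentially with the same result.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cache line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t CACHE_LINE = 64;

// Hypothetical padded slot: alignas rounds sizeof(PaddedSum) up to CACHE_LINE,
// so consecutive slots are at least one cache line apart.
struct alignas(CACHE_LINE) PaddedSum {
    std::size_t value = 0;
};

// In-place inclusive prefix sum over `data`, split into `num_chunks` chunks.
// Returns the total sum. Each chunk writes only to its own padded slot.
std::size_t chunkedPrefixSum(std::vector<std::size_t>& data, std::size_t num_chunks) {
    assert(num_chunks > 0);
    const std::size_t n = data.size();
    const std::size_t chunk = (n + num_chunks - 1) / num_chunks;
    std::vector<PaddedSum> partial(num_chunks);

    // Phase 1: every chunk computes its local sum into its own slot.
    #pragma omp parallel for
    for (std::ptrdiff_t t = 0; t < static_cast<std::ptrdiff_t>(num_chunks); ++t) {
        const std::size_t start = std::min(static_cast<std::size_t>(t) * chunk, n);
        const std::size_t end = std::min(start + chunk, n);
        std::size_t acc = 0;
        for (std::size_t i = start; i < end; ++i) acc += data[i];
        partial[t].value = acc;
    }

    // Phase 2: sequential exclusive scan over the per-chunk sums.
    std::size_t running = 0;
    for (std::size_t t = 0; t < num_chunks; ++t) {
        const std::size_t local = partial[t].value;
        partial[t].value = running;
        running += local;
    }

    // Phase 3: every chunk scans its own range, starting from its offset.
    #pragma omp parallel for
    for (std::ptrdiff_t t = 0; t < static_cast<std::ptrdiff_t>(num_chunks); ++t) {
        const std::size_t start = std::min(static_cast<std::size_t>(t) * chunk, n);
        const std::size_t end = std::min(start + chunk, n);
        std::size_t acc = partial[t].value;
        for (std::size_t i = start; i < end; ++i) {
            acc += data[i];
            data[i] = acc;
        }
    }
    return running;
}
```

With an unpadded `std::vector<std::size_t>` for the partial sums, up to eight neighbouring slots would share one 64-byte line, and the writes in phase 1 could invalidate each other's cache lines. The padding trades a little memory for avoiding that.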
E.g., in nonblocking/blas1.hpp:553. It should instead rely on the internal global buffer for such things. Allocating locally clashes with performance semantics (which typically require that no system calls, including allocations, are made), and the current solution also does not use NUMA-aware allocation (which the global buffer does).
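As a rough sketch of the pattern being asked for, the following shows a primitive that borrows scratch space from a buffer reserved up front instead of allocating inside the call. `ScratchBuffer`, `reserve`, `view`, and `sumOfSquares` are hypothetical names invented for this example and are not the library's internal buffer API. The sketch uses plain `std::malloc` and omits NUMA-aware placement entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for a global scratch buffer. All allocation happens
// in reserve(), which is meant to be called at initialisation time only.
class ScratchBuffer {
    struct FreeDeleter {
        void operator()(void* p) const noexcept { std::free(p); }
    };
    std::unique_ptr<void, FreeDeleter> storage_;
    std::size_t capacity_ = 0;  // in bytes

public:
    // Grows the buffer to at least `bytes`; may allocate. Returns false on failure.
    bool reserve(std::size_t bytes) {
        if (bytes <= capacity_) return true;
        void* p = std::malloc(bytes);
        if (p == nullptr) return false;
        storage_.reset(p);
        capacity_ = bytes;
        return true;
    }

    // Returns space for `count` elements of T, or nullptr if the buffer is
    // too small. Never allocates, so it is safe on a performance-critical path.
    template <typename T>
    T* view(std::size_t count) noexcept {
        if (count > capacity_ / sizeof(T)) return nullptr;
        return static_cast<T*>(storage_.get());
    }
};

// Contrived primitive: stores x[i]^2 in scratch space, then reduces it.
// It fails instead of allocating when the buffer is too small.
bool sumOfSquares(ScratchBuffer& buf, const double* x, std::size_t n, double& out) {
    double* tmp = buf.view<double>(n);
    if (tmp == nullptr) return false;
    for (std::size_t i = 0; i < n; ++i) tmp[i] = x[i] * x[i];
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) acc += tmp[i];
    out = acc;
    return true;
}
```

The design choice here is that the primitive reports failure when the scratch space is insufficient, which leaves the decision to grow the buffer (and thus to make a system call) with the caller.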