Optimize number of atomics in error cases during SYNC_WAIT rearming

Original report by Aurelien Bouteiller (Bitbucket: abouteiller, GitHub: abouteiller).

While fixing some of the erroneous behavior in WAIT_SYNC, we discussed (PR #18) the following optimization, which could reduce the number of atomics in the error case.

Idea:

Rearm the sync as-is, without first detaching all requests. We would still get spurious wakeups, but with fewer atomics when it happens.

This change entails that sync_update(status=err) does not set sync->count to 0, and that the sync remains in a 'signaling' state after it has been triggered in error, so that we do not mistakenly erase it in error cases.

Do we also need a way to safely update the count target after the sync is attached to active reqs (which we do not have now)?
1. When a request has completed (in error or otherwise), the target count on the sync has already been decreased by sync_wait_updated (during or outside of the WAIT_SYNC period in the wait operation). If we do not reset the target count on error, the counting remains correct.
2. If we have a request in error, we need to complete the wait now; there is no need to rearm the sync in this case.
3. So it appears we don't need to do this, which saves us from the associated thread races.
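
The counting argument in the three points above can be illustrated with a small standalone sketch. This is not the Open MPI code: `toy_sync_t`, `toy_sync_update`, and the other names are hypothetical stand-ins, written with C11 atomics, and the sketch only models the bookkeeping (it does not try to reproduce the real signaling path or its atomic count). It assumes every completion, in error or otherwise, decrements the count, and that an error only records the status and marks the sync as signaling without resetting the count to 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sync object; field names are illustrative. */
typedef struct {
    atomic_int  count;      /* requests still outstanding */
    atomic_int  status;     /* 0 = success, nonzero = error recorded */
    atomic_bool signaling;  /* set once the waiter has to be woken */
} toy_sync_t;

static void toy_sync_init(toy_sync_t *s, int nreqs)
{
    assert(nreqs > 0);
    atomic_init(&s->count, nreqs);
    atomic_init(&s->status, 0);
    atomic_init(&s->signaling, false);
}

/* Called once per completed request, in error or otherwise. */
static void toy_sync_update(toy_sync_t *s, int updates, int status)
{
    /* Always decrement; the count is never reset to 0 on error. */
    int remaining = atomic_fetch_sub(&s->count, updates) - updates;

    if (0 != status) {
        /* Error: record it first, then wake the waiter right away. */
        atomic_store(&s->status, status);
        atomic_store(&s->signaling, true);
    } else if (0 == remaining) {
        /* Normal completion of the last outstanding request. */
        atomic_store(&s->signaling, true);
    }
}

/* Waiter side: true when the wait has to return now (no rearming). */
static bool toy_sync_must_complete(toy_sync_t *s)
{
    return atomic_load(&s->signaling);
}

/* Three requests: one succeeds, one fails, one completes late. */
static int toy_demo_remaining_after_error(void)
{
    toy_sync_t s;
    toy_sync_init(&s, 3);
    toy_sync_update(&s, 1, 0);   /* success: 2 outstanding */
    toy_sync_update(&s, 1, -1);  /* error: 1 outstanding, waiter signaled */
    toy_sync_update(&s, 1, 0);   /* late completion: 0 outstanding */
    return atomic_load(&s.count);
}
```

In this toy model the count still matches the number of outstanding requests after the error, so nothing has to adjust the target count once requests are attached.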


ulfm-devel / ompi

Optimize number of atomics in error cases during SYNC_WAIT rearming #53