Open streichler opened 5 years ago
The (re)discovery of arriver locations will be of interest for Realm collective operations as well.
@apryakhin has been working on this:
https://gitlab.com/StanfordLegion/legion/-/merge_requests/971
https://gitlab.com/StanfordLegion/legion/-/merge_requests/956
We are getting close to finishing this work. The prototype branch removes all non-scalable communication, and preliminary results look relatively promising. A large number of changes went into the barrier's state machine, so we would probably need some help from the community (@elliottslaughter @lightsighter @eddy16112) to test the feature and make it production-ready. We obviously won't make it for 24.09, but I think 24.11 is a reasonable milestone, and I expect we could do the final merge sometime in October.
@apryakhin When you're ready, please link the specific MR you want folks to test, and I will ping them (probably after 24.09 is released).
The long-dormant barriers branch has some changes that make barriers scalable (i.e. O(log N) work on each rank instead of O(N) work on one rank) and improve the migration capabilities, but additional work is still needed to handle cases where the arrivers' locations differ from what is expected.
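
For anyone trying to picture the O(log N) vs O(N) difference, here is a rough sketch. It is not the code from the barriers branch or from the MRs above. It assumes a binomial tree rooted at rank 0 as one possible way to spread arrival aggregation across ranks, and all function names are hypothetical. In the flat scheme the owner rank receives one message from every other rank. In the tree scheme each rank only hears from its own children, of which there are at most ceil(log2 N).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Realm API): parent of `rank` in a binomial
// tree rooted at rank 0. Clearing the lowest set bit yields the parent.
inline std::size_t binomial_parent(std::size_t rank) {
  return rank & (rank - 1);
}

// Hypothetical helper: children of `rank` in a binomial tree over
// `num_ranks` ranks. A rank owns the offsets below its lowest set bit.
inline std::vector<std::size_t> binomial_children(std::size_t rank,
                                                  std::size_t num_ranks) {
  std::vector<std::size_t> children;
  for (std::size_t bit = 1; rank + bit < num_ranks; bit <<= 1) {
    if (rank & bit) break;  // reached this rank's lowest set bit
    children.push_back(rank + bit);
  }
  return children;
}

// Worst-case number of incoming arrival messages at any single rank.
inline std::size_t max_fan_in_tree(std::size_t num_ranks) {
  std::size_t worst = 0;
  for (std::size_t r = 0; r < num_ranks; ++r)
    worst = std::max(worst, binomial_children(r, num_ranks).size());
  return worst;
}

// Flat scheme: every non-owner rank reports directly to the owner.
inline std::size_t max_fan_in_flat(std::size_t num_ranks) {
  return num_ranks == 0 ? 0 : num_ranks - 1;
}

// Sum per-rank arrival counts up the tree; the total ends up at rank 0.
// Parents always have a smaller index, so a high-to-low sweep suffices.
inline std::size_t reduce_arrivals(std::vector<std::size_t> local_arrivals) {
  for (std::size_t r = local_arrivals.size(); r-- > 1;)
    local_arrivals[binomial_parent(r)] += local_arrivals[r];
  return local_arrivals.empty() ? 0 : local_arrivals[0];
}
```

With 1024 ranks, `max_fan_in_flat` gives 1023 incoming messages at the owner, while `max_fan_in_tree` gives 10 at the busiest rank. The sketch assumes every arriver sits where the tree expects it, which is exactly the assumption that the remaining work on arriver locations has to relax.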