Open juanjgalvez opened 7 years ago
Original date: 2017-02-16 21:44:51
Core decided that a nodegroup should be added on top of the current group-based CkMulticastMgr.
Original date: 2017-05-09 20:50:04
This won't be an API change, AFAICT, so it could be done in a patch release.
Original date: 2017-08-30 19:47:27
Any update on this?
Original date: 2017-10-11 20:23:18
Currently debugging this on Blue Waters.
Original date: 2017-10-11 20:39:48
This is crashing on BW with 64 nodes.
The dependency chain for building the CkArray group is locMgr->mcastMgr->array. Apparently the crash happens because nodegroup dependencies are not supported (they are ignored). Because mcastMgr sits in the middle of the dependency chain, the end result is that no dependency is enforced during creation.
Original date: 2017-10-30 14:44:20
Respecting the dependencies during creation seems to solve the problem. Performance still needs to be tuned.
However, support for nodegroup dependencies does not exist yet in the main charm branch, and merging a good solution will probably take some time.
Original issue: https://charm.cs.illinois.edu/redmine/issues/1394
Because CkMulticastMgr is a group, it uses a tree structure of PEs to send group messages. The problem is that if one of the PEs in the tree is busy with other work, multicast messages routed through it are delayed, even though other PEs in the same node could have processed them.
The solution is to convert CkMulticastMgr to a nodegroup. The trees should then be built over logical nodes (processes) instead of PEs. Ideally, the spanning tree algorithm will also be physical-node aware when topology information is present.
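To illustrate what a tree over logical nodes could look like, here is a minimal sketch of a k-ary spanning tree indexed by node number rather than by PE. It is not the CkMulticastMgr implementation: `treeChildren`, `treeParent`, and the `branching` parameter are hypothetical names chosen for this example, no Charm++ API is used, and the layout is not topology aware.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Charm++): children of `node` in a k-ary
// spanning tree over logical nodes 0..numNodes-1, rooted at `root`.
std::vector<int> treeChildren(int node, int root, int numNodes, int branching) {
    assert(numNodes > 0 && branching > 0);
    assert(node >= 0 && node < numNodes && root >= 0 && root < numNodes);
    // Renumber the nodes so that the root has rank 0.
    int rank = (node - root + numNodes) % numNodes;
    std::vector<int> children;
    for (int i = 1; i <= branching; ++i) {
        long childRank = static_cast<long>(rank) * branching + i;
        if (childRank >= numNodes) break;  // no more children in range
        // Map the rank back to an actual node index.
        children.push_back(static_cast<int>((childRank + root) % numNodes));
    }
    return children;
}

// Hypothetical helper: parent of `node` in the same tree, or -1 for the root.
int treeParent(int node, int root, int numNodes, int branching) {
    assert(numNodes > 0 && branching > 0);
    assert(node >= 0 && node < numNodes && root >= 0 && root < numNodes);
    int rank = (node - root + numNodes) % numNodes;
    if (rank == 0) return -1;
    int parentRank = (rank - 1) / branching;
    return (parentRank + root) % numNodes;
}
```

With 7 logical nodes, root 0, and a branching factor of 2, node 0 forwards to nodes 1 and 2, and node 2 forwards to nodes 5 and 6. Since each tree vertex is a whole process, any idle PE in that process could forward the message, which is the behavior this issue is after.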