Currently, AMPI implements each communicator as its own separate chare array instance, and all collectives on those communicators are implemented as chare array broadcasts/reductions.
Chare array collectives are sent to and processed by all PEs, regardless of whether any element of the array lives on that PE.
CkMulticast is implemented with an explicit spanning tree across only the PEs that actually have section elements on them, so it does not create extra work for PEs that host no section elements.
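
To make the difference concrete, here is a minimal standalone sketch that compares how many PEs take part in each approach. It is a toy model, not Charm++ or CkMulticast code: the function names, the `elementPe` mapping, and the k-ary tree layout are all assumptions made up for illustration, and the real CkMulticast tree construction may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical model: elementPe[i] is the PE hosting element i of a communicator.

// A chare array collective is delivered to every PE, so all of them do work.
std::size_t pesTouchedByArrayCollective(std::size_t numPes) {
    return numPes;
}

// A section collective only involves the distinct PEs that host an element.
std::size_t pesTouchedBySectionCollective(const std::vector<int>& elementPe) {
    std::set<int> hosts(elementPe.begin(), elementPe.end());
    return hosts.size();
}

// Build a k-ary spanning tree over only the hosting PEs (assumed layout, not
// CkMulticast's actual algorithm). Returns (pe, parentPe) pairs in ascending
// PE order; the root has parent -1.
std::vector<std::pair<int, int>> buildSectionTree(const std::vector<int>& elementPe,
                                                  std::size_t branching) {
    assert(branching > 0);
    std::set<int> unique(elementPe.begin(), elementPe.end());
    std::vector<int> hosts(unique.begin(), unique.end());
    std::vector<std::pair<int, int>> tree;
    for (std::size_t i = 0; i < hosts.size(); ++i) {
        int parent = (i == 0) ? -1 : hosts[(i - 1) / branching];
        tree.emplace_back(hosts[i], parent);
    }
    return tree;
}
```

For example, with 16 PEs and a communicator whose four ranks live on PEs 3, 3, 7, and 12, the array collective model involves all 16 PEs, while the section tree spans only PEs 3, 7, and 12.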
Original issue: https://charm.cs.illinois.edu/redmine/issues/1309