charmplusplus / charm

The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
Apache License 2.0

nocopy accelerated section multicast #1974

Open ericjbohm opened 6 years ago

ericjbohm commented 6 years ago

Original issue: https://charm.cs.illinois.edu/redmine/issues/1974


It should be possible to reduce the number of copies required to implement a section multicast, especially when the entry method receives const data (i.e., readonly data).

Ideally there should be only one copy per address space, with refcounting to handle cleanup once delivery has completed at all leaves within that address space.
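To illustrate the "one copy per address space with refcounting" idea, here is a minimal sketch in plain C++. It is not Charm++ code and does not reflect the runtime's internals: `Payload`, `PE`, `deliver`, and `multicast_in_process` are hypothetical names invented for this example, and `std::shared_ptr` stands in for whatever reference-counting mechanism an actual implementation would use. The payload is const, so every receiver in the process can read the same buffer, and the buffer is freed when the last receiver finishes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a multicast message body (not a Charm++ type).
struct Payload {
    std::vector<double> data;
    static int live;  // number of Payload objects currently alive

    explicit Payload(std::vector<double> d) : data(std::move(d)) { ++live; }
    ~Payload() { --live; }
    Payload(const Payload&) = delete;             // forbid accidental copies
    Payload& operator=(const Payload&) = delete;
};
int Payload::live = 0;

// Read-only handle: receivers share the buffer but cannot modify it.
using SharedPayload = std::shared_ptr<const Payload>;

// Hypothetical model of one PE (a leaf) inside a single address space.
struct PE {
    SharedPayload inbox;
    double result = 0.0;

    // Stand-in for an entry method taking const data: read, then release.
    void deliver() {
        for (double x : inbox->data) result += x;
        inbox.reset();  // drop this PE's reference
    }
};

// Hand the same buffer to every PE in this address space.
// Each assignment bumps the reference count; the data is never copied.
std::vector<PE> multicast_in_process(std::vector<double> data,
                                     std::size_t num_pes) {
    assert(num_pes > 0);
    SharedPayload msg = std::make_shared<Payload>(std::move(data));
    std::vector<PE> pes(num_pes);
    for (PE& pe : pes) pe.inbox = msg;
    return pes;
}
```

In this sketch the buffer is destroyed when the last `deliver()` call releases its reference, which mirrors the cleanup-after-all-leaves behavior described above. `std::shared_ptr` updates its count atomically, so the same pattern holds if the PEs run as threads within one process.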

The current implementation creates one copy per PE, even for [readonly], and doesn't use RDMA to minimize pack/unpack cost.

nitbhat commented 5 years ago

Original date: 2019-03-14 19:47:35


From the Core Meeting on 14th March: Raghavendra, Nitin, and Juan should discuss this feature and estimate the effort involved.