@youd3 brought an imbalance in the array partitioning to my attention. In its current implementation, if P processes are storing P+1 items in a ygm::container::array, the first approximately P/2 ranks will each get 2 items, leaving almost half of the ranks completely empty. This occurs because we determine a global block size of ceil(k/P) for storing k items on P processes, and we assign full blocks until we have run out of data.
This imbalance becomes less severe as the average number of items per rank increases. I believe the fraction of completely empty ranks is bounded by 1 - floor(k/P)/ceil(k/P) in the worst case. This is enough to realistically leave entire compute nodes empty at large scales when k/P is relatively small (measured in hundreds or possibly thousands).
The fix is relatively simple. It just requires two different chunk sizes to be used for the data, and a little bit more math to determine which rank owns a particular index. To maintain consistency, the bag.rebalance() method needs to be updated when array partitioning is changed.
@youd3 brought an imbalance in the array partitioning to my attention. In its current implementation, if
P
processes are storingP+1
items in aygm::container::array
, the first approximatelyP/2
ranks will each get 2 items, leaving almost half of the ranks completely empty. This occurs because we determine a global block size ofceil(k/P)
for storingk
items onP
processes, and we assign full blocks until we have run out of data.This imbalance becomes less severe as the average number of items per rank increases. I believe the fraction of completely empty ranks is bounded by
1 - floor(k/P)/ceil(k/P)
in the worst case. This is enough to realistically leave entire compute nodes empty at large scales whenk/P
is relatively small (measured in hundreds or possibly thousands).The fix is relatively simple. It just requires two different chunk sizes to be used for the data, and a little bit more math to determine which rank owns a particular index. To maintain consistency, the
bag.rebalance()
method needs to be updated when array partitioning is changed.