Open LouisJenkinsCS opened 5 years ago
I'm having trouble understanding what this issue is asking for. Could you show a short example?
use CyclicDist;
var cyclicDom = {1..10} dmapped Cyclic(startIdx=1);
var tidCounter : atomic int;
forall i in cyclicDom with (var tid = tidCounter.fetchAdd(1)) do writeln(tid, " has ", i);
Output
3 has 1
4 has 6
3 has 2
4 has 7
3 has 3
4 has 8
3 has 4
4 has 9
3 has 5
4 has 10
Expected Output
3 has 1
4 has 2
3 has 3
4 has 4
3 has 5
4 has 6
3 has 7
4 has 8
3 has 9
4 has 10
Note that right now, task 3 has indices `1..5`, and task 4 has indices `6..10`. This is what I meant by 'pseudo-block distribution'; I would want task 3 to have `1..10 by 2` and task 4 to have `1..10 by 2 align 2`. As in, a cyclic distribution of ranges among tasks for a `Cyclic` distribution.
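The per-task index sets described above can be sketched with plain range arithmetic. This is only an illustration: `cyclicChunk`, `localInds`, and `numTasks` are hypothetical names, not part of `CyclicDist`, and the task ids are assumed to be 0-based rather than the 3 and 4 seen in the output.

```chapel
// Hypothetical helper (not part of CyclicDist): the indices that task `tid`
// (0-based) would own if `numTasks` tasks split `r` cyclically.
proc cyclicChunk(r: range, tid: int, numTasks: int) {
  return r by numTasks align (r.low + tid);
}

const localInds = 1..10;  // stand-in for the indices local to one locale
const numTasks = 2;

// Serial loop, just to show which indices each task would be handed.
for tid in 0..#numTasks {
  write("task ", tid, " would own:");
  for i in cyclicChunk(localInds, tid, numTasks) do write(" ", i);
  writeln();
}
```

With two tasks, the first task gets the odd indices and the second the even ones, which matches the expected output above.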
It's not obvious to me that this is necessarily the right thing to do (which is not to say that a distribution author couldn't choose to do it). While it might help in some cases, like the specific one that motivated you to post this issue, it could hurt in others, for example through false sharing between tasks that access elements in an interleaved manner rather than grabbing large chunks of contiguous items. It does suggest that a distribution's documentation should specify how local iteration is handled so that performance-minded programmers can reason about it.
Gotcha. Then I suppose I should ask this: if it is not necessarily ideal to have distribution-specific iteration schemes (i.e., `Cyclic` performing cyclic iteration, `Block` performing block iteration, `BlockCyclic` performing block-cyclic iteration), would it be acceptable for me to propose that a specific iterator (i.e., as a method, not necessarily as part of 'dynamic iterators') be added to the Chapel standard distributions? For example, `iter CyclicDom.cyclic()`, `iter CyclicArr.cyclic()`, `iter BlockDom.block()`, and `iter BlockArr.block()`, etc. Preferably with a name that is common among all distributions and simply reflects the type of distribution used, maybe just `chunks`?
use CyclicDist;
var cyclicDom = {1..10} dmapped Cyclic(startIdx=1);
var tidCounter : atomic int;
forall i in cyclicDom.chunks() with (var tid = tidCounter.fetchAdd(1)) do writeln(tid, " has ", i);
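As a rough idea of what such an iterator might do internally, here is a minimal sketch written as a free-standing iterator over a range rather than as a method on the distribution's domain class. The name `cyclicChunks` and its `numTasks` argument are assumptions made for illustration; a real `chunks()` method would presumably take its task count and local indices from the distribution itself.

```chapel
// Serial overload; a parallel overload needs a serial one alongside it.
iter cyclicChunks(r: range, numTasks: int) {
  for i in r do yield i;
}

// Standalone parallel overload (hypothetical): spawns `numTasks` tasks and
// hands each one an interleaved stride of `r` instead of a contiguous chunk.
iter cyclicChunks(param tag: iterKind, r: range, numTasks: int)
  where tag == iterKind.standalone {
  coforall tid in 0..#numTasks do
    for i in r by numTasks align (r.low + tid) do
      yield i;
}

// Usage: each index is visited exactly once, whichever task handles it.
var total: atomic int;
forall i in cyclicChunks(1..10, 2) do
  total.add(i);
writeln(total.read());
```

Keeping this as a separate iterator leaves the default `forall` behavior untouched, so callers who prefer contiguous chunks are unaffected.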
To demonstrate what I mean by example, think of a `Cyclic` distribution. If a domain is domain-mapped over a `Cyclic` distribution, should iteration over such a domain distribute tasks in a cyclic fashion? As in, should it both A) distribute data in a cyclic manner such that each locale owns its own stride over the domain, and B) distribute computation in a cyclic manner such that tasks also own strides over their own local subdomain?

This is related to #13305 in that I believe that this type of behavior should be supported by default. Right now, iterating over a `Cyclic` domain-mapped domain results in a pseudo-block distribution of work for tasks iterating over the local subdomain. I think that the tasks iterating over the local subdomain should follow the pattern of the distribution they are named after, such as block for `Block` (the behavior we have right now), block-cyclic for `BlockCyclic`, and so on.