Open ben-albrecht opened 5 years ago
This is up for grabs as a GSoC 2019 project. Some suggestions to GSoC applicants:
start_test
) and chpldoc
.Though not explicitly listed, this may be considered part of https://github.com/chapel-lang/chapel/issues/6329
I want to take this up as a GSoC 2019 project. I was just wondering if beginning with implementing the repeat
and product
itertools of python would be a good way to start. As far as I know, these tools are not available in Chapel as of now. Please let me know if I missed something, and whether to go ahead with this.
For example, the basic implementation of product for integer ranges could be given as:
iter product(x: range, y: range) {
for i in x do for j in y do yield (i, j);
}
writeln(product(1..5, 3..5));
// (1, 3) (1, 4) (1, 5) (2, 3) (2, 4) (2, 5) (3, 3) (3, 4) (3, 5) (4, 3) (4, 4) (4, 5) (5, 3) (5, 4) (5, 5)
So what I need to do is write the tests and documentation, work on determining whether it can be made parallelizable or not, and make this process fast by reducing communication overheads, right?
Please let me know if I'm missing something. Thank You!
Hi @akshansh2000, repeat
and product
seem like good starts. I expect repeat
to be relatively simple, but product
will be a little more challenging.
Regarding your product
implementation, there are a few design goals to consider:
Beyond that, documentation, evaluating performance, and determining if the implementation should/can be parallelized would be your next steps (in any order you prefer).
Thank you, @ben-albrecht. I'm working on it. I'll keep you updated.
The basic implementation for the repeat tool is given:
iter repeat (arg, times = 0) {
if times == 0 then
for count in 1.. do yield arg;
else
for count in 1..#times do yield arg;
}
// For integer/real/complex etc..
writeln(repeat(10, 15));
// For arrays/strings
writeln(repeat([1, 4, 6, 7], 4));
writeln(repeat("Testing", 5));
// For dictionaries
var dictDomain: domain(string);
var dict: [dictDomain] int;
dict["one"] = 1;
dict["two"] = 2;
dict["three"] = 3;
writeln(repeat((dictDomain, dict), 3));
// For infinite repetition
write(repeat("infinite"));
Output:
10 10 10 10 10 10 10 10 10 10 10 10 10 10 10
1 4 6 7 1 4 6 7 1 4 6 7 1 4 6 7
Testing Testing Testing Testing Testing
({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1) ({two, three, one}, 2 3 1)
infinite infinite infinite infinite .....
Please let me know what all I need to change in this. This sure could be parallelized (only in the case of a bounded range), and I'll work on that next. The documentation and performace evaluation is further in the queue.
@akshansh2000 You may want to make this into its own Pull Request. (also side note: If you add write the code block as ```chpl you get syntax highlighting)
iter repeat (arg, times = 0) {
if times == 0 then
for count in 1.. do yield arg;
else
for count in 1..#times do yield arg;
}
You also will want a parallel iterator. I'll do this one for you so you get the gist of how to do so in Chapel...
iter repeat (arg, times = 0, param tag : iterKind) where tag == iterKind.standalone {
if times == 0 then
forall count in 1.. do yield arg;
else
forall count in 1..#times do yield arg;
}
You'll also definitely want a leader-follower version so you can do something like this...
forall (ix, a) in zip(repeat(fixedIdx), arr) {
a = ix;
}
In the above example, you zip over an infinite stream repeat(fixedIdx)
and some finite container.
Thanks for the clarification on the parallel iterator, @LouisJenkinsCS! I'll work on the leader-follower part and share my progress.
(Thanks for the syntax highlighting part as well hahaha)
Also, as for the PR, I'm supposed to make a Mason package and make a PR to the mason-registry repository, right?
Hm... I'm not sure, honestly. Previously students just submit a PR to the repository and it gets stuck somewhere (likely labeled as some kind of prototype/test), but I suppose it makes sense if @ben-albrecht would want you to implement it as a Mason package.
Alright, I'll wait for @ben-albrecht sir to reply, and work on my code until then.
Thank you!
Mason package vs package module is still an open question at this point. Though, I'm leaning towards a package module right now so that we can utilize the nightly performance testing. Once mason packages have this functionality, we could migrate this to a mason package, like we plan to do for other existing package modules (https://github.com/chapel-lang/chapel/issues/10713).
The repeat iterator is almost complete, I believe. Please take a look at my implementation and let me know if I need to change something.
Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?
In the case of a forall
loop, it needs to know to the total number of tasks to divide the repeat
task into different tasks, while in the case of a coforall
loop, the program terminates at runtime due to excessive memory usage. Please help me out with this, I'm a bit lost.
Here's the code; I've commented the non working code below.
// Serial Iterator
iter repeat (arg, times = 0) {
if times == 0 then
for count in 1.. do yield arg;
else
for count in 1..#times do yield arg;
}
// Standalone Parallel Iterator
iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.standalone {
if times == 0 then
coforall count in 1.. do yield arg; // parallel infinite loop creates problem
else
coforall count in 1..#times do yield arg;
}
// Procedure to compute chunks which are to be iterated through
proc computeChunk(r : range, myChunk, numChunks) where r.stridable == false {
const numElems = r.length;
const elemsPerChunk = numElems / numChunks;
const mylow = r.low + (elemsPerChunk * myChunk);
if (myChunk != numChunks - 1) then
return mylow..#elemsPerChunk;
else
return mylow..r.high;
}
const numTasks = here.maxTaskPar; // max number of tasks supported by locale
// Leader
iter repeat (param tag : iterKind, arg, times = 0) where tag == iterKind.leader {
if times == 0 then
coforall task_id in 0.. do yield(0..1,); // parallel infinite loop creates problem
else
coforall task_id in 0..#numTasks {
const working_iters = computeChunk(0..#times, task_id, numTasks);
yield(working_iters,);
}
}
// Follower
iter repeat (param tag : iterKind, arg, times = 0, followThis)
where tag == iterKind.follower && followThis.size == 1 {
const working_iters = followThis(1);
for idx in working_iters do yield arg;
}
for element in repeat(1, 10) do write(element, ' '); // works - 1 1 1 1 1 1 1 1 1 1
writeln();
forall element in repeat('abc', 4) do write(element, ' '); // works - abc abc abc abc
writeln();
forall element in repeat(123) do write(element, ' '); // doesn't work, parallel infinite loop
writeln();
var arr: [1..5] int;
forall (idx, element) in zip(repeat(3, 5), arr) do element = idx; // works
writeln(arr); // - 3 3 3 3 3
forall (idx, element) in zip(repeat(2), arr) do element = idx; // doesn't work, parallel infinite loop
writeln(arr);
Thank you!
Also, I'm a bit confused here. How do I implement infinite repetition for a parallel loop?
This is not supported today in Chapel. As a result, we will not be able to pursue parallel implementations of any infinite iterators for this project.
Alright, so should I put the rest of the code into a module and create a PR?
Oh yeah, I forgot that you can't break
from a forall
loop. I'm wondering if its wise to return a 'killswitch' which the leader and follower will check to see if it should continue.
Note that if I make the array the 'leader', it works fine. It has to do with the leader not knowing when to stop, which is definitely a problem. TIO
Note that if I make the array the 'leader', it works fine. It has to do with the leader not knowing when to stop, which is definitely a problem. TIO
Noted. So for now should I exclude the infinite-parallel iteration from the code and make a module out of the rest of my code, and throw an error when trying to use infinite iteration with a parallel loop?
Also, should I open an issue regarding the above?
You could halt when you have an infinite parallel loop for now, sure. The lack of a break statement in forall is a well-known issue, so no issue needed (unless Ben thinks differently)
You could halt when you have an infinite parallel loop for now, sure.
That sounds good to me, or just leave the parallel infinite iterator implementations out for now.
The lack of a break statement in forall is a well-known issue, so no issue needed (unless Ben thinks differently)
Despite being well-known, I think it's useful to have an issue to reference / search for. I've created a simple one here: https://github.com/chapel-lang/chapel/issues/12700
@ben-albrecht, @LouisJenkinsCS, I completed the serial version of the cycle itertool, which returns separate elements from an iterable, eg:
writeln(cycle('ABC'));
writeln(cycle(1..4, 2));
>> A B C A B C A B C ...
>> 1 2 3 4 1 2 3 4
I was working on the parallel version, but I am not sure if it makes sense to parallelize. I don't think that a shuffling of order would be desirable.
Is there some case that I'm missing where it might be used?
I don't think a standalone parallel iterator makes sense for cycle
, but a leader/follower parallel iterator might be useful in cases where zipped order matters, e.g.
forall (dayOfMonth, dayOfWeek) in zip(1..30, cycle('MWTRFSU')) {
April[dayOfMonth] = dayOfWeek;
}
@ben-albrecht, @LouisJenkinsCS, I am having some problem implementing the parallel version of the cycle
tool.
The serial version is working fine. However, the parallel version throws a zippered iterations have non-equal lengths
error when the length of the iterable is more than 1. Here is the code for the leader-follower iterator:
use RangeChunk;
iter cycle(param tag: iterKind, arg, times = 0) throws
where tag == iterKind.leader {
var numTasks = if dataParTasksPerLocale > 0 then dataParTasksPerLocale else here.maxTaskPar;
if numTasks > times then numTasks = times;
if __primitive("method call resolves", arg, "these") { // to check if `arg` is iterable
coforall tid in 0..#numTasks {
const working_iters = chunk(0..#times, numTasks, tid);
yield(working_iters,);
}
} else
throw new owned IllegalArgumentError(
"non-iterable type argument passed");
}
iter cycle(param tag: iterKind, arg, times = 0, followThis)
where tag == iterKind.follower && followThis.size == 1 {
const working_iters = followThis(1);
for working_iters do
for element in arg do
yield element;
}
On running the following code,
forall (el, id) in zip(cycle('ABCD', 2), 1..#8) do writeln(el, ' ', id);
the error for unequal lengths as described above is thrown. However, on running
forall (el, id) in zip(cycle(['ABCD'], 8), 1..#8) do writeln(el, ' ', id);
the program works fine.
What could I be missing? Thanks.
@akshansh2000
['ABCD']
creates an array literal (I.E if you do writeln(arg.type : string)
you'll see this fact), while 'ABCD'
creates a string. When you do ['ABCD', 'ABCD', 'ABCD']
you get the same error; the issue has to do with the fact that you are yielding more than requested.
cycle('ABCD', 2)
should print out 'AB'
, while ['ABCD']
should print out 'ABCD', 'ABCD'
. To do this, you'll want to ensure that you short-circuit the computations. I.E insert a break
inside of that follower sequential for
loop when you hit the upper limit early.
@LouisJenkinsCS
Actually, cycle('ABCD', 2)
should print A B C D A B C D
, and that's why in the first example I set the times
argument to 2, and let id
iterate from 1..8
, due to the string length being 4.
One cycle
iteration means going through the whole iterable exactly once. Sorry for the missing explanation.
Similarly, in the second example, as the iterable length is 1, repeating it 8 times would require id
to range from 1..8
use RangeChunk;
iter cycle(arg, times = 0, param tag: iterKind) throws
where tag == iterKind.leader {
writeln(arg.type : string);
var numTasks = if dataParTasksPerLocale > 0 then dataParTasksPerLocale else here.maxTaskPar;
if numTasks > times then numTasks = times;
if __primitive("method call resolves", arg, "these") { // to check if `arg` is iterable
coforall tid in 0..#numTasks {
const working_iters = chunk(0..#times, numTasks, tid);
yield(working_iters,);
}
} else
throw new owned IllegalArgumentError(
"non-iterable type argument passed");
}
iter cycle(arg, times = 0, followThis, param tag: iterKind)
where tag == iterKind.follower && followThis.size == 1 {
const working_iters = followThis(1);
writeln(working_iters);
var upperBound = working_iters.size;
var _times = 0;
label outerLoop
while true do
for element in arg {
_times += 1;
yield element;
if _times == upperBound then break outerLoop;
}
}
iter cycle(arg, times = 0) {}
forall (el, id) in zip(cycle('ABCD', 2), 1..#8) do writeln(el, ' ', id);
If you want it to go around in a cycle, it'd have to be a bit more complicated than what we have right now. If this is what you want you should obtain the size
of the arg
(assert that it is resolvable just like how you did for these
), and then have some additional logic to skip the first N iterations over arg
. RIght now it prints 'A' 'A'
because both begin at the same letter. Then again, you could make this more efficient by only creating tasks if the number of times
exceeds arg.size
.
I think I misunderstood the intention here. A string by itself is iterable and yields characters; if you want the string itself to be iterated over, the array literal is the best approach.
@LouisJenkinsCS, does that mean storing the string as an array of characters?
TIO That should work
That is a great approach, @LouisJenkinsCS.
I'll try to optimize it and see if we can get it done using even fewer resources. I'll get back here if I have any doubts. Thank you!
TIO That should work
Nice example @LouisJenkinsCS.
I have a small suggestion - Reflection.canResolveMethod
would be preferred over the __primitive
call, which is not a public interface (and is therefore subject to change in the future without proper deprecation).
e.g.
use Reflection only;
...
if Reflection.canResolveMethod(arg, "these") {
...
} else {
throw new owned IllegalArgumentError(...);
}
@LouisJenkinsCS @ben-albrecht
TIO That should work
This code doesn't work if the times
argument is more than 2. TIO1, TIO2
I was trying to debug it.
I took another example of a simple iterator that yields 1
eight times. The serial part of it (TIO3) is given as
iter abc() {
for 1..8 do
for j in 1..1 do
yield j;
}
I then switched to a leader-follower one, but that didn't work quite well (TIO4). I set the default value of numTasks
to 4 so that the range gets divided evenly.
idx
starts from 2
instead of 1
halt reached - invoking orderToIndex on an integer 8 that is larger than the range's number of indices 8
This leader-follower one should yield 1
eight times, but it fails and gives the error as mentioned in TIO4. I can't understand why this happened. Also, to check if the range was getting divided evenly and the number of yielded 1
s were eight, I wrote the following code (without an upper bound in the follower of the forall
loop): TIO5.
1
s are 8, and so I can't understand what caused a problem in TIO4idx
still starts from 2
for some reasonAnd I guess the initial error which I got a few days back (zippered iterations have non-equal lengths
), TIO6, was related to this as well?
Am I doing something wrong here?
Most of our standard leader-follower iterators yield a tuple of indices using 0-based indexing. This is simply a convention so that if you're zippering something that's 0..7 with something that's 1..8 with something that's 1..16 by 2, they all have a common basis for what global iteration you're on. This means that an 8-element range's follower is expecting to receive indices in the 0..7 range. I expect that things are going wrong because your leader is yielding indices in the 1..8 range (but am surprised that the range iterator doesn't complain more about being given an iteration that's OOB... that seems like a bug). Specifically, if I make your leader return working_iters-1, it seems to work as expected: TIO
@bradcray, yes, I should've taken care of that. I understand now. Thank you for the explanation!
One more thing, though. Why does an unbounded range starting from 1
not cause an error? In this TIO, the yielded 1
s are eight (which shouldn't be as per my understanding), and the indexing also starts from 2
. Why does this happen?
It would've been much easier to take care of if the range follower had complained at you that you were requesting unreasonable indices... It'd be good for one of us to look into why that was.
It'd be good for one of us to look into why that was.
I just did this and only now realized that it was complaining at you, athough the error message could've been clearer:
error: halt reached - invoking orderToIndex on an integer 8 that is larger than the range's number of indices 8
I missed it the first time because I forgot that TIO separates stdout from stderr. Running it from my console made it much clearer...
I think that's as expected. Unbounded ranges are treated a bit specially in that they are permitted to be zippered with anything without running into a length check validation failure. So since TIO5 wasn't 0-basing its indices, it yielded its second through #8'th element (so 2..9).
[edit: There's a long-term intention to give users the ability to write their own unbounded iterators that can similarly conform to the leader's size, but we haven't ever completed the design and implementation of that feature].
I'd love to help with it in any way possible!
@akshansh2000: I wish I knew how to advise you to do so. It's a pretty big design challenge and likely to end up with a very different world than the one we have today. The general ideas we've had (that I'm aware of) are based on increasing the amount of interaction between a leader iterator and the follower iterators such that it's up to a follower to say "mismatched zipper iteration" or just let the follower consume as much of the leader as it wants. The thinking is that this interaction would also help fix a longstanding bug in which expressions like forall (i,j) in zip(1..3, 1..4)
don't currently result in size mismatch errors. (I took a stab at this latter bug in https://github.com/chapel-lang/chapel/issues/11685 and there may be some other helpful notes there as well... or not). Again: big design challenge.
@bradcray, I guess I'd have to dig deep into the core concepts of Chapel for that, on it!
For anyone curious, the current progress of the itertools library can be found here: https://github.com/chapel-lang/chapel/tree/master/test/library/packages/Itertools
Much work remains to be done.
It would be useful to provide a toolkit of common serial and parallel iterators to users through a library. Python’s itertools library would make a good reference for target functionality as well as early performance comparisons. However, this task is not limited to iterators available in itertools.
Assuming we're porting python itertools, we would need to determine the following for each iterator:
There are a number of example recipes provided in the itertools documentation that demonstrate combining multiple iterators together to create powerful constructs. Being able to reproduce many (or all) of these constructs would be a good design goal.