chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
1.76k stars 414 forks source link

[Patterns] Should distributed maps support parallel updates to values? #18962

Open lydia-duncan opened 2 years ago

lydia-duncan commented 2 years ago

This issue is part of a series of issues to design the interface for collections across serial, parallel, and distributed contexts. Our goal is to determine what we think is reasonable to support and what doesn't seem useful. Additional patterns welcome (but it would be best to put them in their own issue for discussion, likely)

var m = new DistributedMap(...);

forall (k, v) in m {
  on k {
    v += k.locale; // Today, I think the interface requires `m.set(k, v + k.locale);`?
  }
}
mppf commented 2 years ago

Wouldn't the code in the OP have a race condition if adding to an integer, say? I suppose it comes down to the question of whether or not the parallel map should be helping to ensure atomicity. I think it should, mainly for efficiency reasons, but I could also argue the other viewpoint.

lydia-duncan commented 2 years ago

I also think the parallel or distributed maps should help to ensure atomicity on a per-key basis. I do think that other iterations from the same forall shouldn't interfere with each other, though

vasslitvinov commented 2 years ago

Do non-distributed maps support parallel updates to values? If they do, then it makes a lot of sense for distributed maps to support them, too.

Wouldn't the code in the OP have a race condition if adding to an integer, say?

Intuitively, a forall (k, v) in m should not have race conditions among v across loop iterations, by analogy to a forall over array elements forall a in myArray.

lydia-duncan commented 2 years ago

Do non-distributed maps support parallel updates to values? If they do, then it makes a lot of sense for distributed maps to support them, too.

Part of the point of the overarching exercise is to consider how parallel maps would work as well. Maybe worth opening a similar issue?

vasslitvinov commented 2 years ago

consider how parallel maps would work

It makes sense to me to create an issue that says "parallel maps offer all features of serial maps plus X and Y and Z" and one with "distributed maps offer all features of parallel maps plus A B C". Have those issues discuss what those ABCs and XYZs are.

bradcray commented 2 years ago

Should distributed maps support parallel updates to values?

My intuition is that there would need to be since, if there were a serial access point to a distributed map, it would prevent scalability, which would be one of the major motivators of using a distributed map.

The bigger question is "what sorts of parallelism are supported?" I think Vass is right that the code in the OP doesn't have a race, and would guess that @mppf was thinking of a case more like:

forall keys in someArrayOfKeys do
  myMap(key) += 1;

where multiple keys could be trying to update the same slot simulteously.

Note that I don't think the on in the OP should be required. In the same way that a forall over a distributed array naturally distributes the iteration on a coarser grain than "every iteration", I'd expect a forall over a distributed map to do the same.

mppf commented 2 years ago

would guess that @mppf was thinking of a case more like:

Yes, that's right. Today we have some pretty worrying race conditions with that pattern even if the elements are atomic ints.