wg14-cplex / epp

Extensions for parallel programming

should the object declared by "_Capture" be const-qualified? #8

Open nelsc opened 8 years ago

BlaineGarst commented 8 years ago

The design seems to be: use _Capture to snapshot (and possibly modify) a stack variable (the dominant case), and simply reference the variable to get a live, address-captured stack variable. Well, address-captured variables are a bad idea; they should be captured as const instead, in which case it's trivial to get a mutable version.

The simpler design is to eliminate _Capture and just go with const captured locals (and globals).

nelsc commented 8 years ago

> just go with const captured locals (and globals).

Here you seem to be proposing that, within a spawn statement or parallel loop, every outer-scope name should be treated as if it were const-qualified, whereas elsewhere you seem to propose that this should be true only for outer-scope names with automatic storage duration.

Have I misunderstood you? Or are you really not sure exactly what you want to propose?

BlaineGarst commented 8 years ago

The "(and globals)" was a mis-statement on my part.

I'm tempted to say "(and _Thread_local)" instead but I won't. But it might be worth discussing.

phalpern commented 8 years ago

Whatever we decide regarding implicit const qualification, I believe strongly that local variables, thread-local variables, and global variables should be treated identically by a _Capture clause. Anything else would seem to be obfuscation. In all cases, the captured variable is a local copy, and is no longer global or thread-local.

The reason for a _Capture clause is to make a local copy of an outer-scoped variable (or expression) BEFORE parallel execution of a new task begins. As such, it cannot be accomplished by just using captured locals, const qualified or not, as the lifetime of those captured variables may not be sufficiently long. Motivating example:

// Walk a list and call f() on the value of each element.
// Calls to f() can be done in parallel.
_Parallel_task _Block {
    while (p) {
        _Task_parallel _Spawn _Capture(p) { f(p->value); }
        p = p->next;
    }
}

Note that p might change before f() is run. Without the explicit _Capture, this would produce a race. The situation is the same whether or not p is const qualified.
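The difference between a snapshot and a live reference can be sketched in standard C without any of the proposed keywords. This is only an illustration under stated assumptions: `struct node`, `struct task`, `walk_and_run`, and `MAX_TASKS` are hypothetical names, and deferred execution (tasks run after the walk) stands in for parallel execution so the outcome is deterministic.

```c
#include <stddef.h>

/* Hypothetical list node, mirroring the p->value / p->next usage above. */
struct node {
    int value;
    struct node *next;
};

enum { MAX_TASKS = 16 };

/* A deferred task holding both flavors of "capture" for comparison. */
struct task {
    struct node *captured;  /* copy of p taken at spawn time */
    struct node **live;     /* address of p itself, read when the task runs */
};

/* Walk the list, record one task per node, and run the tasks only after the
 * walk has finished. Returns the number of tasks run. */
size_t walk_and_run(struct node *head, int by_copy[], int by_ref[])
{
    struct task tasks[MAX_TASKS];
    size_t n = 0;
    struct node *p = head;

    while (p && n < MAX_TASKS) {
        tasks[n].captured = p;  /* roughly what _Capture(p) is meant to do */
        tasks[n].live = &p;     /* what an uncaptured reference amounts to */
        n++;
        p = p->next;
    }

    /* The tasks run late: p has already moved past every recorded node
     * (it is NULL for lists of up to MAX_TASKS nodes). */
    for (size_t i = 0; i < n; i++) {
        struct node *seen = *tasks[i].live;
        by_copy[i] = tasks[i].captured->value;
        by_ref[i] = seen ? seen->value : -1;  /* -1 marks "p is already NULL" */
    }
    return n;
}
```

For a three-node list, every `by_copy` entry holds its own node's value, while every `by_ref` entry reflects whatever `p` happens to be when the task finally runs. In a truly parallel setting the second case would be a data race rather than a predictable wrong answer.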

BlaineGarst commented 8 years ago

So "p" is a global that is safely traversed by this loop.

I would suggest that

_Parallel_task _Block {
    while (p) {
        typeof(p) p_copy = p;
        _Task_parallel _Spawn { f(p_copy->value); }
        p = p->next;
    }
}

is better since there is no _Capture needed! Note that in your example if someone forgets to do _Capture they are really hosed.

phalpern commented 8 years ago

Nope. p_copy can go out of scope and get overwritten before it is read within the spawned task.

(In reply to BlaineGarst's comment of Jan 11, 2016, 1:00 PM.)

nelsc commented 8 years ago

And even if Blaine's example worked, there would still be code which could easily be forgotten, drastically hosing the programmer. His proposal didn't actually eliminate the need for a capture, it only changed the way it was spelled.

BlaineGarst commented 8 years ago

Hmm, I think we're speaking past each other.

In my view, the spawn "block" would be, in fact, a closure where p_copy would be const captured, so although p_copy might go out of scope it doesn't matter.
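A rough way to picture the closure reading is a heap-allocated record that copies its environment when it is created. The sketch below is plain C and is not taken from the working paper: `struct closure`, `make_closure`, `read_value`, and `run_list` are hypothetical names, and the closures are run after the loop rather than concurrently.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical closure record: the environment is copied at creation time
 * and is const from then on. */
struct closure {
    int (*body)(const struct closure *self);
    struct node *const p_copy;  /* const-captured snapshot */
};

/* Allocate a closure that owns its own copy of p. */
struct closure *make_closure(int (*body)(const struct closure *), struct node *p)
{
    struct closure init = { body, p };
    struct closure *c = malloc(sizeof *c);
    if (c)
        memcpy(c, &init, sizeof init);  /* initialize the const member */
    return c;
}

/* Closure body: the capture is const, but a mutable local is one line away. */
int read_value(const struct closure *self)
{
    struct node *q = self->p_copy;  /* mutable copy of the const capture */
    /* self->p_copy = q->next;  would not compile: p_copy is const */
    return q->value;
}

/* Create one closure per node, then run them all after the loop. */
int run_list(struct node *p)
{
    struct closure *pending[16];
    size_t n = 0;

    while (p && n < 16) {
        struct node *p_copy = p;  /* block-scoped, gone after this iteration */
        pending[n] = make_closure(read_value, p_copy);
        if (pending[n])
            n++;
        p = p->next;
    }

    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += pending[i]->body(pending[i]);  /* reads the closure's own copy */
        free(pending[i]);
    }
    return total;
}
```

Because each closure carries its own copy, the block-scoped `p_copy` going out of scope is harmless here. The cost is one allocation and copy per spawn, which is the overhead question raised for fine-grained work.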

Having a uniform treatment of run-concurrently code - especially if/when we add closures - makes more sense than teaching N different rule sets for N constructs.

Clark, I don't understand your point. Can you give some examples? The whole premise that the coder has to guarantee race-free expressions hasn't been very viable over the decades, and I think we both agree that the programmer has to know what is going on. Not having Capture() at all, because it is done automatically, reduces the conceptual load on the programmer without introducing race issues.

phalpern commented 8 years ago

The _Copy_in construct in the WP mostly matches the firstprivate construct in OpenMP. A spawn block was never specified as being a closure, and certainly not a by-copy closure. Doing so significantly deviates from OpenMP practice. I am not convinced that it would be efficient for fine-grained parallel work, but I have no evidence one way or another. Note that, for consistency, these by-copy closure semantics would also need to apply to parallel loops. For such fine-grained parallelism, I don't know of any prior art for treating a task as a closure that copies its environment, so if we wish to pursue this idea, I think we are talking about a half-year delay (at least) in the TS, including a commitment to creating an implementation of the proposed semantics. Maybe that's the right thing to do.
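For comparison, the list walk from earlier in the thread might look like this with OpenMP tasks and firstprivate. This is a sketch, not code from the working paper: `struct node`, its `result` field, `process_list`, and this version of `f` (which takes the node rather than its value so each task has somewhere to store a result) are assumptions made for illustration. Compiled without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```c
#include <stddef.h>

struct node {
    int value;
    int result;
    struct node *next;
};

/* Hypothetical stand-in for f(): store twice the value back in the node.
 * Each task touches only its own node, so the tasks do not conflict. */
static void f(struct node *n)
{
    n->result = 2 * n->value;
}

void process_list(struct node *head)
{
    #pragma omp parallel
    #pragma omp single
    {
        struct node *p = head;
        while (p) {
            /* firstprivate(p): the task gets its own copy of p, initialized
             * when the task construct is encountered, before p advances. */
            #pragma omp task firstprivate(p)
            f(p);
            p = p->next;
        }
    }  /* implicit barrier: all generated tasks have completed here */
}
```

The copy is made per variable named in the clause, not for the whole enclosing environment, which is the distinction being drawn against a by-copy closure.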