p5h / p5summit-2019

Perl 5 Summit

multi processor support without me thinking about it #5

Open toddr opened 5 years ago

Leont commented 5 years ago

TBH, this sounds like wanting a unicorn

toddr commented 5 years ago

> TBH, this sounds like wanting a unicorn

I'd settle for a Pony but yes. Your point is taken. This is what people often ask for though so I thought I'd put it down as a starting point even if we blow it off.

leonerd commented 5 years ago

It depends how you implement it. If you go down the "well most people want X" route, then most people want concurrent preëmptive multiple-threading on mutable shared state (i.e. what things like C++ and Java have trained people to want), and that is a race-condition nightmare that waits to happen before you even finish typing the sentence.

Perl's original way out of that was to do default-shared-nothing threading (i.e. the fork-like threads model) and that, while sucking terribly at CPU and memory performance, is at least safe from data-manipulation race conditions and badly-written code that wasn't expecting multiple threading, so it doesn't feel too bad. Shame about the performance though.

If there hadn't been a scheduling clash between LPW and P5summit, I was going to present a talk at LPW about the async/await syntax, and how it effectively provides a whitelisting approach to concurrency issues, as compared to the blacklisting approach provided by other techniques like mutex locking. The overall thrust is that if all of your asynchronous concurrency is provided by futures and async/await syntax, and you use shared-nothing containers (i.e. as existing perl threads) for your actual multicore parallelism, then it is much, much harder to write logic that contains concurrency bugs like race conditions. This issue thread probably isn't the venue to explain the entirety of that talk, but suffice it to say it's an issue I feel very strongly about and could expand on a great deal.

It can be summarized by: if core really wants some decent multi-core / async / concurrency / parallelism support that doesn't encourage data-race bugs, then it would be wise to look at futures and async/await and other similar features in many other current languages. Chasing the unicorn of "what 1980s programmers thought threads should look like" is not it.

tonycoz commented 5 years ago

Wouldn't this be more like grep, map, foreach, regexps(?) working in parallel?

This is what I tend to think of for "multi processor support without me thinking about it", it's something I've considered for Imager.

Of course, anything that tries to work in parallel for perl at this level runs into the same issues as 5.005 threads.

atoomic commented 5 years ago

this would be awesome indeed to have a flavor set of grep / map, ... working in parallel for you :-) with a very simple syntax

pgrep
pmap
pforeach
...

leonerd commented 5 years ago

map and grep are good examples because the body of the loop is likely to be a side-effect-free[*] pure function. The problem comes when that body actually does cause side effects.

my $x = 1;
my @results = pmap { $x++ } 'a' .. 'z';

What on earth would - or even could - this yield? In regular map we know it must and can only ever yield the list 1 .. 26 because each invocation of the body is made in sequence, entirely, from beginning to end, non-concurrently. What would running those in parallel even do?

Offhand, I can only think of one useful and sensible result, and that would be the list (1) x 26 because mutations of the shared concurrent state could not safely be made. The parallelism would have to be performed inside a throwaway virtual container of some kind, where any actual mutations get thrown away for this iteration and don't affect the parent. That gets really horrible as soon as IO is involved though. So what would we do there - ban any IO at all, and only allow pure computation? That doesn't sound useful for real-world cases.
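
The "throwaway container" behaviour can be approximated today with fork. The following is only a minimal sketch, assuming a Unix-like system with a real fork(); pmap_sketch is a hypothetical name, not an existing core or CPAN function, and it naively forks one child per item and passes each result back as a plain string over a pipe.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not a real core function): run the block once per
# item, each time in its own forked child, and collect the results in order.
sub pmap_sketch(&@) {
    my ($code, @items) = @_;
    my @children;
    for my $item (@items) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: works on a private copy of every variable.
            close $reader;
            local $_ = $item;
            my $result = $code->();
            print {$writer} $result;
            close $writer;          # flushes the result into the pipe
            POSIX::_exit(0);        # leave without running END blocks
        }
        # Parent: keep only the read end for this child.
        close $writer;
        push @children, [ $pid, $reader ];
    }
    my @results;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        local $/;                   # slurp everything the child wrote
        my $out = <$reader>;
        close $reader;
        waitpid($pid, 0);
        push @results, $out;
    }
    return @results;
}

my $x = 1;
my @results = pmap_sketch { $x++ } 'a' .. 'z';
print "results: @results\n";
print "x in parent: $x\n";
```

Each child increments its own copy of $x, so every element comes back as 1 and the parent's $x is still 1 afterwards. The IO problem is visible here too: anything a child prints or writes elsewhere is a real side effect that cannot be thrown away.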

Again, I really strongly advocate not trying to build, or encourage, any situation involving shared mutable state with concurrent actors. It is not a fun world to be in.

*: as much as can be said in perl

leonerd commented 5 years ago

Furthermore, in the case of map and grep (which is just a special form of map), there already exists the entire fmap family of functions in Future::Utils which lets people write concurrent async map-like behaviour with actual controlled concurrency of IO operations.

In the case of foreach, that already behaves exactly as expected with Future::AsyncAwait.

iabyn commented 5 years ago

A big issue in perl, and what really killed the 5.005 threading model, is that just about anything can modify an SV. Its reference count is up and down like a yoyo. Things like $x = 1; $y = "$foo$x" secretly, behind the scenes, upgrade $x's SV body from non-existent to XPVIV with a "1\0" string attached to it. Also, tying and overloading mean that any rvalue access to a value can trigger almost anything happening.
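
The hidden upgrade can be observed from Perl itself. This is a minimal sketch using the core B module; sv_kind is a hypothetical helper that reports which B class currently describes the scalar, and the exact flags set alongside the upgrade vary between perl versions.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: name of the B class for the SV's current type.
sub sv_kind {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

my $x = 1;
my $before = sv_kind(\$x);      # integer-only scalar at this point

my $y = "value: $x";            # only *reads* $x, as far as the source says

my $after = sv_kind(\$x);       # the SV now also carries a string buffer
print "before: $before\n";
print "after:  $after\n";
```

On the perls this was written against, the first line should report B::IV and the second B::PVIV: a read-only use of $x has rewritten its SV, which is exactly the kind of silent mutation that makes sharing SVs between threads unsafe.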

Abigail commented 5 years ago

On Tue, Oct 15, 2019 at 03:25:47AM -0700, Paul Evans wrote:

> Furthermore, in the case of map and grep (which is just a special form of map), there already exists the entire fmap family of functions in Future::Utils which lets people write concurrent async map-like behaviour with actual controlled concurrency of IO operations.

> In the case of foreach, that already behaves exactly as expected with Future::AsyncAwait.

But does this make use of multiple processors/cores?

Abigail

Leont commented 5 years ago

> It depends how you implement it. If you go down the "well most people want X" route, then most people want concurrent preëmptive multiple-threading on mutable shared state (i.e. what things like C++ and Java have trained people to want), and that is a race-condition nightmare that waits to happen before you even finish typing the sentence.

Agreed

> Perl's original way out of that was to do default-shared-nothing threading (i.e. the fork-like threads model) and that, while sucking terribly at CPU and memory performance, is at least safe from data-manipulation race conditions and badly-written code that wasn't expecting multiple threading, so it doesn't feel too bad. Shame about the performance though.

Having a shared-nothing threading model makes sense given the primitives we have, but faking a shared memory model on top of it (threads::shared) was clearly a mistake. What we need is a threading model that is both compatible with the VM that we have, and that's usable for end-users.

> If there hadn't been a scheduling clash between LPW and P5summit, I was going to present a talk at LPW about the async/await syntax, and how it effectively provides a whitelisting approach to concurrency issues, as compared to the blacklisting approach provided by other techniques like mutex locking. The overall thrust is that if all of your asynchronous concurrency is provided by futures and async/await syntax, and you use shared-nothing containers (i.e. as existing perl threads) for your actual multicore parallelism, then it is much, much harder to write logic that contains concurrency bugs like race conditions. This issue thread probably isn't the venue to explain the entirety of that talk, but suffice it to say it's an issue I feel very strongly about and could expand on a great deal.

async/await is invaluable for a lot of asynchronous code, but it doesn't really help with actual multi-processor support.