Idea: making non-IO schedulers efficient

iu-parfunc / verified-instances

Verified instances for parallel programming.

BSD 3-Clause "New" or "Revised" License

8 stars 5 forks source link

This was a thought experiment I dumped into Slack, but Slack makes it very easy to lose things, so I'm archiving it here:

July 10th

The thing that is getting me excited about it is closing the gap between a practical (real, physically parallel) scheduler, and this idealized one -- by changing the proof as little as possible, but changing the abstractions we implement parallel scheduling with instead.

I.e. the normal way we implement runPar is to give an interpretation to our logical threads as real IO threads.

That's gross because we don't have good reasoning tools for IO. (Albeit, Colin Gordon and Niki have had a start at this issue.)

My understanding is that it's only an early start -- we don't have a formalized memory model... we don't have something like an Iris program logic for reasoning about IO programs concurrently interacting with the heap.

So ... no IO. Forget IO for implementing schedulers. (edited)

Instead, I think we can build parallel schedulers using a combination of: (1) laziness (2) futures (3) linear types

We've got this fork of GHC with linear types (our other POPL submission) and it allows the magic of purely functional interfaces to efficient, in-place mutable data structures.

I'm attracted by the idea that we can combine Liquid + Linear Haskell trivially. Correct me if I'm wrong, but the linear-or-non-linear status of a binding shouldn't matter one bit for refinements, right? It's orthogonal information....

So here's a basic sketch of the idea: (1) @vikraman's current scheduler steps one (random) thread from the threadpool at each timestep, this new scheduler would do something similar, but it would do it by evaluating a random thunk in a sequence of thunks to WHNF. (2) The thunks represent schedulable computations, and they return a result value, coroutines style: Get iv k, Put iv val k, Done. Gets and Puts request modifications of the heap, and supply continuations. Pure computation is "instantaneous", and each thread computes atomically until its next ivar interaction. (We can trivially add Yield as well, if desired.) (3) In the initial version, all updates to the heap are done by the scheduler. The scheduler is a pure function, but it uses a linear API to the heap. Heap updates can be turned into actual imperative updates to a concurrent map (IvarID -> Value).

Ok, so yes, the mutable data with linear API would be part of the TCB. But the scheduler would not be!

July 14

Casting it as a deforestation problem: Let's say execute actually runs an action such as Put iv 3 in the scheduler. For example, execute = (\Put k v -> writeMap heap k v).

The scheduler normally reads and forces a (random) worker thread result (act, pool') = deque pool, and then calls execute act. But it's a normal program optimization to "push" execute inside the child thread, deforesting Put iv 3 with subsequent optimizations.

map execute [result1, result2, ...] == [execute result1, execute result2, ...]

The steps in order would be something like: (1) make the proof go through for the cartoon scheduler in LH (2) linear-types: deforest the entropy source (list of random numbers) (3) linear-types: use a concurrent map for the heap (4) trickiness: deforest the spine of the concurrent map to make it behave like references (5) perform some interesting program manipulation/deforestation to make the worker threads do the heap updates themselves, rather than the scheduler being a (sequential) bottleneck.

For the IO-free version I described. For @ranjitjhala's proposed version, maybe step (1) at least would be the same And subsequent steps would include developing some kind of RGRef, plus some kind of limited ST monad .... Clarification of (2): Current entropy: [Int] (or an explicit stream)

BUT if you rewrite that as an explicit stream / iterator RNG -> (Int, RNG) And you make that linear, it becomes efficient: RNG -o (Unrestricted Int, RNG) ("Unrestricted" = Bang, ! modality in linear logic)

This sounds awesome, how about you work it into the relevant part of 4?

On Sun, Nov 12, 2017 at 2:23 PM Ryan Newton notifications@github.com wrote:

This was a thought experiment I dumped into Slack, but Slack makes it very easy to lose things, so I'm archiving it here: July 10th

The thing that is getting me excited about it is closing the gap between a practical (real, physically parallel) scheduler, and this idealized one -- by changing the proof as little as possible, but changing the abstractions we implement parallel scheduling with instead.

I.e. the normal way we implement runPar is to give an interpretation to our logical threads as real IO threads.

That's gross because we don't have good reasoning tools for IO. (Albeit, Colin Gordon and Niki have had a start at this issue.)

My understanding is that it's only an early start -- we don't have a formalized memory model... we don't have something like an Iris program logic for reasoning about IO programs concurrently interacting with the heap.

So ... no IO. Forget IO for implementing schedulers. (edited)

Instead, I think we can build parallel schedulers using a combination of: (1) laziness (2) futures (3) linear types

We've got this fork of GHC with linear types (our other POPL submission) and it allows the magic of purely functional interfaces to efficient, in-place mutable data structures.

I'm attracted by the idea that we can combine Liquid + Linear Haskell trivially. Correct me if I'm wrong, but the linear-or-non-linear status of a binding shouldn't matter one bit for refinements, right? It's orthogonal information....

So here's a basic sketch of the idea: (1) @vikraman https://github.com/vikraman's current scheduler steps one (random) thread from the threadpool at each timestep, this new scheduler would do something similar, but it would do it by evaluating a random thunk in a sequence of thunks to WHNF. (2) The thunks represent schedulable computations, and they return a result value, coroutines style: Get iv k, Put iv val k, Done. Gets and Puts request modifications of the heap, and supply continuations. Pure computation is "instantaneous", and each thread computes atomically until its next ivar interaction. (We can trivially add Yield as well, if desired.) (3) In the initial version, all updates to the heap are done by the scheduler. The scheduler is a pure function, but it uses a linear API to the heap. Heap updates can be turned into actual imperative updates to a concurrent map (IvarID -> Value).

Ok, so yes, the mutable data with linear API would be part of the TCB. But the scheduler would not be! July 14

Casting it as a deforestation problem: Let's say execute actually runs an action such as Put iv 3 in the scheduler. For example, execute = (\Put k v -> writeMap heap k v).

The scheduler normally reads and forces a (random) worker thread result (act, pool') = deque pool, and then calls execute act. But it's a normal program optimization to "push" execute inside the child thread, deforesting Put iv 3 with subsequent optimizations.

map execute [result1, result2, ...] == [execute result1, execute result2, ...]

The steps in order would be something like: (1) make the proof go through for the cartoon scheduler in LH (2) linear-types: deforest the entropy source (list of random numbers) (3) linear-types: use a concurrent map for the heap (4) trickiness: deforest the spine of the concurrent map to make it behave like references (5) perform some interesting program manipulation/deforestation to make the worker threads do the heap updates themselves, rather than the scheduler being a (sequential) bottleneck.

For the IO-free version I described. For @ranjitjhala https://github.com/ranjitjhala's proposed version, maybe step (1) at least would be the same And subsequent steps would include developing some kind of RGRef, plus some kind of limited ST monad .... Clarification of (2): Current entropy: [Int] (or an explicit stream)

BUT if you rewrite that as an explicit stream / iterator RNG -> (Int, RNG) And you make that linear, it becomes efficient: RNG -o (Unrestricted Int, RNG) ("Unrestricted" = Bang, ! modality in linear logic)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/iu-parfunc/verified-instances/issues/9, or mute the thread https://github.com/notifications/unsubscribe-auth/ABkuODiM5CRfbeOPVp-sRbKHP0faeuCYks5s12_HgaJpZM4QbEfP .

iu-parfunc / verified-instances

Idea: making non-IO schedulers efficient #9

July 10th

July 14