Abstract out scheduling

ndmitchell commented 4 years ago

In order for @spall to integrate their work with a different scheduler, it's important to have more than one scheduler. I initially suggested a free monad, but I don't think that's the right abstraction anymore. Thinking about it:

The user issues commands. There are basically two things the user can do that impact the scheduler in any way - they can call parallel to run many commands in parallel, and call cmd to cause an action to run an action.
The scheduler can run at most N commands at a time. At each time step, the scheduler has to decide which cmd to run.
After a cmd completes, the scheduler is responsible for deciding if it got a hazard, or not, and what to do next.

So my questions are:

Should the scheduler have access to the parallel information? Or should these just look like commands and the scheduler is unaware of their topology of branching? @spall - do you think you can make use of that information?
Do we want to enable cancelling a command halfway through?
Do we ever want to store more than just "what happened last time" in the speculation?

I am thinking a "scheduler" should be abstract in what a command is, and roughly have the interface:

-- given c which is the abstract type of command
-- and s which is the abstract type of information stored in a scheduler
newScheduler :: [(c, RW)] -> s -- create based on what happened last time
scheduleDequeue :: s -> Maybe (c, s) -- what would the scheduler like to execute next
scheduleEnqueue :: c -> s -> s -- something has been demanded by the user
scheduleRetire :: c -> RW -> s -> Either FatalError s -- something finished executing

That doesn't let you cancel running jobs, and doesn't let you observe where the user requests parallelism. My guess is s would be a type class, so we would have multiple schedulers to choose from, and c would be polymorphic so that the scheduler can only use c as a key.

spall commented 4 years ago

Should the scheduler have access to the parallel information? Or should these just look like commands and the scheduler is unaware of their topology of branching? @spall - do you think you can make use of that information?

If I understand this correctly, I think that information could be used. I think more fine-grained cmd information is best for enabling more parallelism. I'm thinking of a situation where 10 cmds are running parallel and we have an additional cmd that is only dependent on 1 of those 10 cmds. If we view the 10 cmds as a single cmd then we have less freedom in scheduling. Am I understanding this correctly?

Do we want to enable cancelling a command halfway through?

How would a scheduler decide if a cmd needed to be cancelled before it completed?

ndmitchell commented 4 years ago

If we view the 10 cmds as a single cmd then we have less freedom in scheduling. Am I understanding this correctly?

My idea was there is the scheduler above, which really just puts things in the queue to be run. The queue of what is to be run would have to understand what blocked on what, and thus would automatically take care of running things when their dependencies had finished. What you can't do with this abstraction is decide to what to speculate based on the parallel fan out or similar.

How would a scheduler decide if a cmd needed to be cancelled before it completed?

You could imagine it speculatively runs a command, a real command comes in, and it decides that it would rather free up the thread to run the real command. If we can avoid cancellation it becomes a much simpler model.

spall commented 4 years ago

The queue of what is to be run would have to understand what blocked on what, and thus would automatically take care of running things when their dependencies had finished.

This sounds quite different than what we have now. Agree?

What you can't do with this abstraction is decide to what to speculate based on the parallel fan out or similar.

You mean the cmds running in parallel are viewed as one unit?

You could imagine it speculatively runs a command, a real command comes in, and it decides that it would rather free up the thread to run the real command. If we can avoid cancellation it becomes a much simpler model.

This is very interesting. The simple model without cancellation is the model we have now right? Or is the last sentence referring to something else?

ndmitchell commented 4 years ago

Agree it's very different - it's separating out execution from decision making.

You mean the cmds running in parallel are viewed as one unit?

Not really. I mean you get asked "here's a new command", you don't get told that these two were issued in parallel, although you do get the new command sooner than you would have otherwise. You can see the parallel threads hitting you, but you don't know which thread a command request came in.

The simple model without cancellation is the model we have now right?

We don't have cancellation now. This abstraction would let us plug in the logic more easily, in a way that could replicate what we have now.

ndmitchell commented 4 years ago

I made a start. I think the changes that need making more broadly are:

Run should stop using T, and move to a custom R data type.
Trace should record the start and stop time in this execution, meaning T is now unused.

Then it's much easier to detect conflicts with Trace alone.

ndmitchell commented 4 years ago

After a day of patching, I'm happy enough with the current structure that instead of putting the information outside the Server module, it's now reasonable to put it inside the Server. I suggest you put in your additional speculation state in S (if you need any) and rewrite nextSpeculate to do an if to try the alternative scheduling.

Does that seem reasonable?

spall commented 4 years ago

Yes, I can try that and if I run into major difficulties I'll document them here and we can deal with them later.

ndmitchell / rattle

Abstract out scheduling #17