U8NWXD opened this issue 2 years ago
Hey @U8NWXD, this is a great proposal. The existing parallel implementation was really just the first step I had in mind and more an expedient for getting large(r) colonies; I'm glad you're revisiting this.
I think the key, as you mention, is making a kind of "process protocol", what you are calling Process Commands. This leads into supporting a general RPC-like API for querying and interacting with processes, which could be the basis for a distributed API where each process (or engine) can run on its own machine, coordinating through this API.
I think the protocol (let's call it a protocol) would involve the functions you mention, and in addition some more members of the `Process` interface, namely:

- `calculate_timestep`
- `ports_schema`
- `next_update` (currently the only one supported)
- `end` (you called it `halt`; it is called `end` in `Engine`... either way)
- `inspect` (retrieve any relevant internal state report)

Maybe others? I think supporting `__init__` would be great too, but that seems more like an `Engine` method which would have the ability to "allocate" new process instances.
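A dispatcher for this kind of protocol could look something like the sketch below. All names here (`ProcessProtocol`, the command strings, the `run_command` signature) are illustrations of the thread's ideas, not vivarium-core's actual API:

```python
class ProcessProtocol:
    """Hypothetical dispatcher mapping command names to Process members.

    Wraps a process object and exposes the members listed above
    through a single RPC-like entry point.
    """

    def __init__(self, process):
        self.process = process
        self.running = True

    def run_command(self, command, args=()):
        if command == 'next_update':
            interval, state = args
            return self.process.next_update(interval, state)
        if command == 'calculate_timestep':
            (state,) = args
            return self.process.calculate_timestep(state)
        if command == 'ports_schema':
            return self.process.ports_schema()
        if command == 'end':  # called "halt" earlier in the thread
            self.running = False
            return None
        if command == 'inspect':  # report any relevant internal state
            return vars(self.process)
        raise ValueError(f'Unknown command: {command}')
```

Because every interaction goes through `run_command`, the same dispatch could run in-process, behind a pipe, or behind a network socket, which is what makes the distributed-API idea plausible.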
I've already been plotting a bit along these lines so could be worthwhile discussing IRL. Or we can hash it all out here and in your other PR.
I have a rough implementation in #198 that passes our existing tests. I still need to do some more comprehensive testing though. I decided to support all of the existing `Process` methods in the process commands protocol, which lets us make `ParallelProcess` inherit from `Process`. I think that will make the implementation in `Engine` cleaner: there will be lots of `Process` instances in the store hierarchy. Some of them will happen to be `ParallelProcess` instances, but `Engine` will, for the most part, be able to treat them as normal `Process` objects.
I like the proposed "Process Commands" -- agreed with Ryan that this will help beyond simplifying parallel processes in that it declares a general protocol. I also find `invoke_process` confusing, and have to trace through the code every time.
It looks like you have worked through some of the listed downsides on the draft PR? `ParallelProcess` is now inheriting from `Process`? This seems correct.
It would be great to keep `ParallelProcess` objects in the store hierarchy, and in `Engine.processes` -- a nice simplification that will save a lot of confusion. You can remove `Engine.parallel` without breaking any of the projects that I have helped with, which is most of them. So this won't be a problem.
So yes, this proposal looks good!
## Current Approach

### Message Passing
When a process has `Process.parallel` set to `True`, we launch a parallel OS process that runs `vivarium.core.process._run_update()`. `_run_update()` contains a loop that repeatedly receives `(interval, state)` tuples from a pipe, passes those arguments to `Process.next_update()`, and sends the result back through the pipe. To stop the parallel process, we pass `interval=-1`.
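In outline, that loop behaves like the toy sketch below. The function and process class here are stand-ins (the real `_run_update()` in `vivarium.core.process` handles more details):

```python
import multiprocessing

def run_update_sketch(connection, process):
    # Receive (interval, state) tuples until the main OS process
    # sends the stop sentinel interval=-1; send each update back.
    while True:
        interval, state = connection.recv()
        if interval == -1:
            break
        connection.send(process.next_update(interval, state))
    connection.close()

class DoublingProcess:
    """Toy stand-in for a vivarium Process."""
    def next_update(self, interval, state):
        return {'x': state['x'] * 2}

if __name__ == '__main__':
    parent, child = multiprocessing.Pipe()
    worker = multiprocessing.Process(
        target=run_update_sketch, args=(child, DoublingProcess()))
    worker.start()
    parent.send((1.0, {'x': 3}))
    print(parent.recv())
    parent.send((-1, None))  # stop the parallel OS process
    worker.join()
```

Note that overloading `interval` as both an argument and a stop sentinel is exactly the kind of implicit convention the Process Commands proposal below replaces with named commands.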
### Tracking in Main OS Process

In the main OS process (which contains `Engine`), we store `ParallelProcess` instances in `Engine.parallel`. Then when we need an update from a process, we call `Engine._invoke_process`, which does the following:

- If the process is parallel, it asks the `ParallelProcess` to compute the update and returns it.
- Otherwise, it wraps the process in an `InvokeProcess`, which has an interface similar to `ParallelProcess` but computes the update immediately.

Note that there's an extra `invoke_process` function that computes the process's update, but this extra level of indirection appears unnecessary.

## Problems
The way we currently track parallel processes has a number of downsides:

- We store `ParallelProcess` instances in `Engine.parallel`, but we also store the original `Process` instances in the store hierarchy (with a reference in `self.processes`).
- Keeping those original `Process` instances in the store hierarchy is confusing. A user can read out the internal state of those processes with no problem, but they're getting the state from a process that hasn't changed since the beginning of the simulation, which is not intuitive.

## Proposed Solution
- Remove the `invoke_process` function, which doesn't appear to do anything.
- Instead of storing `ParallelProcess` instances in `self.parallel`, put them directly into the store hierarchy with references in `self.processes`. Once a parallel process has been put on its own OS process, there should be no copies of it left in the main OS process.
- Systematize message-passing between the main and parallel OS processes as commands with arguments.
  - `Process` will have a `run_command()` method that accepts a command name (string) and a tuple of arguments. The `_run_update()` function would handle the following commands:
    - `_halt`: Ignores arguments and shuts down the parallel process (the equivalent of passing `interval=-1` currently).
    - `_next_update`: Takes `(interval, state)` as arguments and passes them to `Process.next_update()`.
  - Process authors can override `run_command()` to handle more commands, e.g. to return internal state variables.
  - `ParallelProcess` will also provide `run_command()`, but instead of running commands itself, it will pass those commands through the pipe to its child process's `_run_update()` function, which will in turn pass them to `Process.run_command()`.

I've started implementing this in #198.
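To make the command-forwarding shape concrete, here is a minimal sketch of the proposal. The names (`run_commands_sketch`, `ParallelProcessSketch`) are illustrative stand-ins, not the actual #198 implementation:

```python
import multiprocessing

def run_commands_sketch(connection, process):
    # Child-side loop: handles _halt and _next_update itself, and
    # forwards any other command name to Process.run_command().
    while True:
        command, args = connection.recv()
        if command == '_halt':
            break
        if command == '_next_update':
            interval, state = args
            connection.send(process.next_update(interval, state))
        else:
            connection.send(process.run_command(command, args))
    connection.close()

class ParallelProcessSketch:
    """Stand-in proxy: forwards commands through a pipe to the child."""

    def __init__(self, process):
        self.connection, child = multiprocessing.Pipe()
        self._worker = multiprocessing.Process(
            target=run_commands_sketch, args=(child, process))
        self._worker.start()

    def run_command(self, command, args=()):
        self.connection.send((command, args))
        return self.connection.recv()

    def end(self):
        self.connection.send(('_halt', ()))
        self._worker.join()
```

Because the proxy and the child loop speak the same `(command, args)` protocol, custom commands added by a process author (e.g. an `inspect` command) flow through without any changes to `Engine`.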
I think this proposal addresses all the problems described above. However, it brings some new downsides:
- Not everything in the store hierarchy will be a `Process` instance anymore; some entries will be `ParallelProcess` instances. We don't want to make `ParallelProcess` inherit from `Process` because it doesn't know enough to do things like generate composites, and adding those missing capabilities is overkill. Still, having non-`Process` instances in the store hierarchy is very counter-intuitive.
- This is a breaking change. The biggest breaking change is probably removing `Engine.parallel`, but I doubt anyone relies on that.