vivarium-collective / vivarium-core

Core Interface and Engine for Vivarium
https://vivarium-core.readthedocs.io/
Apache License 2.0
23 stars 2 forks source link

Engine(Process) #193

Closed prismofeverything closed 2 years ago

prismofeverything commented 2 years ago

During discussions about how to parallelize groups of processes that share local state (instead of just a single process at a time) we realized that Engine itself could be a process, whose parameters are the arguments to Engine, that could then be parallelized as a whole. This could represent a kind of "unit of simulation" as a bundle of related processes that could then be distributed across machines (eventually) and cores (presently).

In order to pull off this feat, we needed to implement the Process interface within Engine, specifically:

I give each method a section here.

init

A mildly drastic change to the interface of Engine here, instead of taking keywords it instead takes a single parameters dictionary like any other process. This means the loss of typing hints, for the sake of consistency. We explored other options but ultimately none of them allowed us to parsimoniously retain the keyword args. On the upside you can now instantiate an Engine directly with a Composite (which is itself a dict with specific keys) like so:

composer = SimulationComposer()
engine = Engine(composer.generate())

calculate_timestep

The timestep for the Engine process does not need to adhere to its subprocesses, but it can. If you provide a timestep parameter to Engine it will run its simulation for that timestep, however many times each of its constituent processes happens to run in that time. Otherwise it will run for as long as it needs to produce the next update, so essentially the min() of all the subprocess timesteps.

ports_schema

The ports are accomplished by a new concept (and a new potential key in Composite) called intertopology. The intertopology is a mapping between the keys in the Engine's ports and the paths to the inner store that these outer keys point to. It has the same form as a topology and all the same features, ie _path, * etc. Part of this PR was pulling out the topology functionality from Store and making it operate also on equivalent dictionaries (this is marked by the creation of project_topology, which converts a dict according to the given topology, and it's new friends get_path, establish_path, and outer_path, which all have equivalents in Store. Perhaps we implement the Store methods from these? Something to explore in a future PR).

next_update

With all this in place, next_update is fairly simple. Invert all the states according to the intertopology, set those states in the Engine, then run it for the given time step, collecting all the updates and returning them together as a single update. This also means that you can return lists of updates rather than a single update from a process now, in order to support the idea that we want to replay these updates onto the outer Engine's states to make use of their updaters, rather than return the internal updated states directly.

conclusion

All input welcome! It's a rather significant change but also seems "right" somehow, in that the design is more expressive in a way that uses everything that was already there. We solved it by linking up existing elements. Engine is just another process.

I imagine there may be considerations I didn't cover here, let me know if you see any.


By creating this pull request, I agree to the Contributor License Agreement, which is available in CLA.md at the top level of this repository.

eagmon commented 2 years ago

Glad to see this up! I am wondering what is still left to do in later PRs. It looks like this PR focused on creating separate EngineProcess instances and wiring them together. I think there are a few remaining features to making this fully configurable and easy-to-use:

  1. Giving Engine a single Store with "cuts" that break its branches into sub-composites that can each go into their own EngineProcess -- what is required for this?
  2. Division -- have you tested division of an EngineProcess? An update would have to go to the super-Engine, which would launch two sub-Engines.
  3. glob schema rewiring -- how configurable are these across an Engine interface?
  4. Anything else come up while you were working on this?
eagmon commented 2 years ago

Going through the mypy checks, we are left with one interesting error. Engine.next_update is returning its Update as a list, which comes from the accumulated intertopology_updates in Engine.run_for. Process.next_update has its Update return type set as a dict. How do we reconcile this -- should updates also be lists? This might break assumptions.

eagmon commented 2 years ago

Also got pylint down to one error -- Engine.run_for has gone over the limit of 65 statements, it is at 71. So we should look for ways to break it up or reduce it.

prismofeverything commented 2 years ago

Cool! Let's see....

  1. The automated cutting is going to be a whole feature on it's own.... I have some ideas, but it is non-trivial. Let's discuss?
  2. Division requires a Composer that generates the Engine.... as long as we have that I can't imagine it not working, but it's good to verify! I'll make a test for it.
  3. There is a test for it in engine_tests.py that remaps a variable inside a glob schema.... it's as flexible as glob schemas anywhere else, as it uses the same functionality. As to whether that is as expressive as it could be, that is another consideration but orthogonal really. Will be good to circle back around on that some day.
  4. Lots of ideas, but all leading towards the fully distributed implementation. I think this sets the stage well for that.

As for updates being lists vs not, I set it up so you can do either, so we may want to make the type an option for those? Or we can say all updates are always lists, but that would break a lot of backward compatibility, so my tendency would be to just support both.

I'll take a look at the linting, thanks for the feedback!

eagmon commented 2 years ago

working through the tests. Updates are now a Union[dict, list] to make mypy happy. I broke off a sub-method from Engine.run_for called Engine._run_process() which tries to run a single process. So pylint is happy too. I think that was the last hurdle -- just minor documentation tweaks.

prismofeverything commented 2 years ago

Further discussions have lead to the instigation of a whole new version of Vivarium -- 2.0. We will use this branch as a starting point (creating a new development branch with tests/actions etc), but this PR will be closed for now.