Closed 1fish2 closed 2 years ago
My 2 cents on the deriver-process-factory-composite-compartment-generator inheritance question:
First some definitions:
- A Composite is a collection of Processes wired together with a topology. This is just a dictionary with keys `processes` and `topology`.
- A Factory produces Composites with its `generate_topology` and `generate_processes` functions.

Given these definitions, Composite isn't a class--it's just a dictionary with particular keys. The class is a Factory (what we used to call Generator). A Process knows how to create a simple Composite from itself, so it inherits from Factory.
As far as code changes, this means that we combine the Composite class into Factory. Since Composite is the only class that inherits from Factory, this shouldn't be a problem. Then, we have the following hierarchy:
    Factory     dict
       ↑          ↑
    Process   Composite
       ↑
    Deriver

Here, Factory objects can produce Composites.
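To make the proposed hierarchy concrete, here is a minimal sketch. It is not vivarium-core's actual code: the `generate()` method, the `name` and `ports` attributes, and the `Growth` process are all hypothetical names introduced for illustration. Only `generate_processes` and `generate_topology` come from the definitions above.

```python
from typing import Any, Dict, Optional

# A Composite is modeled as a plain dict with 'processes' and 'topology' keys.
Composite = Dict[str, Dict[str, Any]]


class Factory:
    """Hypothetical base class: anything that can produce a Composite."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = dict(config or {})

    def generate_processes(self, config: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError

    def generate_topology(self, config: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError

    def generate(self) -> Composite:
        # Assumed helper name: bundle both parts into one Composite dict.
        return {
            'processes': self.generate_processes(self.config),
            'topology': self.generate_topology(self.config),
        }


class Process(Factory):
    """A Process produces a one-process Composite containing itself."""
    name = 'process'
    ports = ()  # hypothetical: names of the ports this process exposes

    def generate_processes(self, config):
        return {self.name: self}

    def generate_topology(self, config):
        # Wire each port to a top-level store of the same name.
        return {self.name: {port: (port,) for port in self.ports}}


class Deriver(Process):
    """A Process that only derives values from the current state."""


class Growth(Process):
    name = 'growth'
    ports = ('mass',)


composite = Growth().generate()
print(composite['topology'])
```

Under this sketch a Composite never needs its own class: it is whatever dict a Factory returns.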
@eagmon @prismofeverything @1fish2 what do you think?
I think @U8NWXD's proposed definitions make sense, and would be fine re-combining Composite and Factory. Separating Composite from Factory did seem a bit unnecessary, but I could also see other variations of Factory being made in the future.
Regarding the Process `derivers()` method -- I'm in support of removing it entirely, and then removing the `generate_derivers()` function so that all derivers have to be declared explicitly in a Composite. The only thing holding me back is all the use of `derivers` in vivarium-cell. I propose that rather than going through the labor of converting all of those, we keep the older version of vivarium-core for vivarium-cell and deprecate vivarium-cell. I can pull the multibody process out to a separate library. Other processes can be moved to a separate "playground" repository as they become needed.
Edit: I started vivarium-pymunk -- https://github.com/vivarium-collective/vivarium-pymunk. Metabolism has also been moved to its own library -- https://github.com/vivarium-collective/vivarium-cobra. The rest of the processes in vivarium-cell can be handled as-needed, so I say we go ahead and remove automatic derivers.
Edit: automatic derivers are removed in commit 4905ec65657031274e0fc0747fb0bc15c2f41dd8
An additional breaking change that I have considered is to make the helper functions in `composition.py` use more typical function arguments rather than a `settings` dictionary with argument keywords. Any thoughts on this? If we are making breaking changes, it might be worth getting them all in at the same time.
`**kwargs` for `settings` might be more convenient. That changes the way of declaring and documenting settings item types and defaults to ordinary Python:
def simulate_experiment(experiment: SomethingHere, *, total_time: int = 10, return_raw_data: bool = False, timeline: SomethingElseHere = None, **kwargs): ...
Otherwise we'd need to use `TypedDict` to declare the setting value types and prose to document their defaults, and we couldn't make some settings like `total_time` required.
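A small side-by-side sketch of the two styles may help the comparison. The names `simulate_a`, `simulate_b`, and `Settings` are hypothetical stand-ins, not functions from `composition.py`, and the bodies just echo the resolved settings.

```python
from typing import Any, Dict, Optional, TypedDict  # TypedDict needs Python 3.8+


# Option A: keyword-only arguments. Types and defaults live in the signature,
# and total_time becomes required simply by giving it no default.
def simulate_a(experiment: Any, *, total_time: int,
               return_raw_data: bool = False,
               timeline: Optional[list] = None) -> Dict[str, Any]:
    return {'total_time': total_time,
            'return_raw_data': return_raw_data,
            'timeline': timeline}


# Option B: a settings dict. With total=False every key is optional to a
# type checker, and nothing is enforced at runtime.
class Settings(TypedDict, total=False):
    total_time: int
    return_raw_data: bool
    timeline: list


def simulate_b(experiment: Any, settings: Settings) -> Dict[str, Any]:
    # Defaults must be applied by hand and documented in prose.
    return {'total_time': settings.get('total_time', 10),
            'return_raw_data': settings.get('return_raw_data', False),
            'timeline': settings.get('timeline')}


print(simulate_a(None, total_time=5))
print(simulate_b(None, {'return_raw_data': True}))
```

With option A, omitting `total_time` fails immediately with a `TypeError` at the call site; with option B the omission silently falls back to the hand-written default.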
Thanks, @U8NWXD for your 2¢! So a Deriver is for computing derived values from the current state, not for modeling, and it has to run after all Processes in order to get the Processes' updated outputs.
Q. Are Processes first-order difference equations (computing new state values as a function of the previous time step's state values) while Derivers compute other state values as a function of current values? And must Derivers' output state variables not overlap with those of other Derivers or of any Processes?
s1[t] = process1(s1[t - 1], s2[t - 1])
s2[t] = deriver1(s1[t])
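The two equations can be read as the following loop. This is only a sketch of the question being asked: the dynamics inside `process1`, the constant `VOLUME`, and the `run` helper are made up for illustration, with `s1` playing the role of a molecule count and `s2` its concentration.

```python
VOLUME = 2.0  # hypothetical constant volume


def process1(s1_prev: float, s2_prev: float) -> float:
    # Difference equation: the new count depends on the previous step's values.
    return s1_prev + 0.5 * s2_prev


def deriver1(s1: float) -> float:
    # Derived value: concentration computed from the current count.
    return s1 / VOLUME


def run(s1: float, steps: int):
    s2 = deriver1(s1)  # derive once so the initial state is consistent
    history = [(s1, s2)]
    for _ in range(steps):
        s1 = process1(s1, s2)  # s1[t] = process1(s1[t - 1], s2[t - 1])
        s2 = deriver1(s1)      # s2[t] = deriver1(s1[t])
        history.append((s1, s2))
    return history


history = run(8.0, 2)
print(history)  # [(8.0, 4.0), (10.0, 5.0), (12.5, 6.25)]
```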
Q. Should Derivers be topo-sorted like spreadsheet formulas so they can get other derivers' output values?
A Composite is a collection of Processes wired together with a topology. This is just a dictionary with keys processes and topology. A Factory produces Composites with its generate_topology and generate_processes functions.
To clarify, a Factory builds a Composite (of Processes) from a Composite specification which is just a dictionary?
Can we meet to discuss process-factory-composite-compartment inheritance? I'm still fuzzy on it.
@1fish2 -- Yes, we should meet to talk about process-factory-composite-compartment. Everything that you mentioned above sounds correct, except that Deriver values can overlap with other Processes' values. In fact, some systems depend on that -- for example, a process that increases counts of a molecule based on its concentrations depends on a deriver that updates concentrations based on counts. But we should clear up any ambiguity, and get more refined usage of our terms.
@U8NWXD -- thinking about Composite as a dict with particular keys (`'processes'` and `'topology'`) is actually very clarifying. Composition then is just a merge operation of Composites. I am still not fully satisfied with the `merge` operation I put in there, but the dictionary approach makes it much easier to think about. I have wondered about merging factories -- can we wire the Composite output of a Factory in another Factory? I think easy merging will be very helpful, so that Composites can be incrementally built up rather than re-defined every time.
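As a rough illustration of composition as a dictionary merge, here is a sketch. It is not the existing `merge` implementation: `merge_composites` is a hypothetical helper, strings stand in for Process instances, and refusing name collisions is just one possible policy.

```python
from typing import Any, Dict

Composite = Dict[str, Dict[str, Any]]


def merge_composites(*composites: Composite) -> Composite:
    """Combine Composites key by key, refusing silent overwrites."""
    merged: Composite = {'processes': {}, 'topology': {}}
    for composite in composites:
        for key in ('processes', 'topology'):
            for name, value in composite.get(key, {}).items():
                if name in merged[key]:
                    raise ValueError(f'duplicate {key} entry: {name!r}')
                merged[key][name] = value
    return merged


# Strings stand in for Process instances to keep the sketch self-contained.
growth = {
    'processes': {'growth': 'GrowthProcess()'},
    'topology': {'growth': {'mass': ('mass',)}},
}
derive = {
    'processes': {'volume': 'VolumeDeriver()'},
    'topology': {'volume': {'mass': ('mass',), 'volume': ('volume',)}},
}

cell = merge_composites(growth, derive)
print(sorted(cell['processes']))  # ['growth', 'volume']
```

Because both topologies point their `mass` port at the same `('mass',)` path, the merged Composite wires the two processes to a shared store without either one being redefined.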
Derivers are really important. They are like Processes with a timestep of 0, which only run after the dynamic Processes to derive values. For example, concentrations from counts or volume from mass. Currently, all of the derivers run in order after every Process update. Ideally there should be a topo-sorted approach that triggers Derivers only when their input values are updated. I have removed the automatic derivers in #49, so Processes no longer declare their required derivers. We should make it easy to pre-bundle Processes and Derivers together in Composites/Factories -- this relates to the easy merge operations described above.
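A topo-sorted, update-triggered scheme could look roughly like this. It is a sketch under assumptions, not a description of how vivarium-core schedules derivers: the `derivers` table, `deriver_order`, and `triggered` are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical derivers: name -> (input variables, output variables).
derivers = {
    'concentration': ({'counts', 'volume'}, {'concentration'}),
    'volume': ({'mass'}, {'volume'}),
    'mass': ({'counts'}, {'mass'}),
}


def deriver_order(derivers):
    """Order derivers so each runs after the derivers producing its inputs."""
    producer = {out: name
                for name, (_, outs) in derivers.items() for out in outs}
    graph = {
        name: {producer[i] for i in ins
               if i in producer and producer[i] != name}
        for name, (ins, _) in derivers.items()
    }
    return list(TopologicalSorter(graph).static_order())


def triggered(derivers, order, updated):
    """Select only the derivers whose inputs changed, directly or downstream."""
    dirty = set(updated)
    to_run = []
    for name in order:
        ins, outs = derivers[name]
        if ins & dirty:
            to_run.append(name)
            dirty |= outs  # this deriver's outputs may trigger later ones
    return to_run


order = deriver_order(derivers)
print(order)                                    # ['mass', 'volume', 'concentration']
print(triggered(derivers, order, {'volume'}))   # ['concentration']
```

A cycle among derivers would make `static_order()` raise `graphlib.CycleError`, which is probably the behavior one would want to surface to the model author.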
Checking off "Can we move the generated doc files out of git?" because the API stubs are no longer auto-generated.
@eagmon why are we closing this? I don't think we've finished making all these decisions
@U8NWXD Maybe we should meet to look at it very soon? The paper will be online probably this week and that will come with more traffic, including to issues.
Here are my suggestions for how to tackle the remaining tasks here for 1.0:
- [x] Finalize which methods are class methods.
  We can stop ignoring `no-self-use` in `.pylintrc` to flag all the cases where methods accept `self` but don't use it.
- [x] Which definitions should be private? Use `__all__` or `_xyz` names to declare that.
- [ ] Which modules should be private?
  I prefer `_xyz` names. I'd propose going through `core` and searching for every definition to see if it can be made private. I don't think we have any modules to make private.
- [x] Which Python versions to officially support? 3.6 - 3.9? Declare that in `setup.py`.
  I think we should only support Python versions that we run tests on. Right now that's only 3.8, but I think we should see if our tests pass on the later Python versions too.
- [x] Use `abc` abstract base classes & methods anywhere else?
  We currently use this for `Process` (so `Step` is by extension an abstract class) and `Composer`. I think those are the only classes that we want to make abstract.
- [x] The library has 130 `isinstance()` checks! `isinstance()` should be used sparingly since it makes for fragile code.
  I think we can release 1.0 without fixing this.
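On the `abc` item above, the following sketch shows why `Step` ends up abstract "by extension". The classes here are simplified stand-ins rather than the library's real definitions: the method name `next_update`, its signature, and `VolumeStep` are assumptions for illustration.

```python
import abc


class Process(abc.ABC):
    """Stand-in for an abstract Process base class."""

    @abc.abstractmethod
    def next_update(self, timestep, states):
        """Return an update computed from the given states."""


class Step(Process):
    """Defines no methods, so it inherits the abstract method and stays abstract."""


class VolumeStep(Step):
    """Concrete subclass: implementing the abstract method makes it instantiable."""

    def next_update(self, timestep, states):
        return {'volume': states['mass'] / states['density']}


try:
    Step()
    step_is_abstract = False
except TypeError:
    step_is_abstract = True

print(step_is_abstract)  # True
print(VolumeStep().next_update(0, {'mass': 6.0, 'density': 2.0}))
```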
Some questions to address before publishing the Vivarium paper and inviting library users we can't consult before making incompatible changes:
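- Finalize which methods are class methods.
- Which definitions should be private? Use `__all__` or `_xyz` names to declare that.
- Which modules should be private?
- Which Python versions to officially support? 3.6 - 3.9? Declare that in `setup.py`.
- Use `abc` abstract base classes & methods anywhere else?
- The library has 130 `isinstance()` checks! `isinstance()` should be used sparingly since it makes for fragile code.
- Can we move the generated doc files out of git?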