flo-schu / pymob

Python model building platform
MIT License

Feature: Implement more flexible control over data dimensionality #6

Open flo-schu opened 9 months ago

flo-schu commented 9 months ago

Currently pymob only supports datasets in which all variables have the same dimensionality. For the case study reversible_damage this no longer holds when more substances are included simultaneously. It becomes even less feasible when datasets with potentially hundreds to tens of thousands of gene expression signals are included. Here, more granular control is needed over which data variables have which dimensionality.

As an ad hoc solution, this should be handled with a workaround implemented in the solver that is specific to the reversible_damage problem, because a general solution involves major breaking changes to the current API. In general, though, it would be desirable to have this capability.

The implementation should be rebased on #2, because it can make good use of the new config API and the other refactorings that have been implemented.

flo-schu commented 9 months ago

Suggestion for cfg files

[simulation]
input_files = params.json
dimensions = id substance time
modeltype = stochastic
# exclude 2nd Aulhorn experiment
substance_range = 0 inf
apical_effect = lethal
hpf = 24
data_variables = cext cint nrf2 lethality
data_variables_max = nan nan nan 1
data_variables_min = 0 0 0 0
seed = 1

[dataset.dimensions]
# describe the dimensionality of the dataset. This setting is essential to 
# automatically assemble, scale and compare simulation and observation datasets.
cext = id substance time
cint = id substance time
nrf2 = id time
lethality = id time

[dataset.dimension.coordinates]
# optionally give the coordinates of the dimensions. This setting also 
# modifies which datapoints of the dataset will be used for comparison
# e.g. time=24 will only include observations after 24 h in the dataset.
substance = diuron diclofenac naproxen
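
A minimal sketch of how the suggested sections could be read, using only the standard library `configparser`. The function names `read_dimensions` and `read_coordinates` are hypothetical and not part of pymob; the section and option names are taken from the suggestion above. The check against `data_variables` is an assumption about desirable behaviour: it would catch a misspelled variable name in `[dataset.dimensions]`.

```python
from configparser import ConfigParser

# Reduced version of the suggested config, kept inline so the sketch is self-contained.
CFG_TEXT = """
[simulation]
data_variables = cext cint nrf2 lethality

[dataset.dimensions]
cext = id substance time
cint = id substance time
nrf2 = id time
lethality = id time

[dataset.dimension.coordinates]
substance = diuron diclofenac naproxen
"""


def read_dimensions(cfg_text):
    """Return {variable: tuple of dimension names} (hypothetical helper)."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    variables = parser.get("simulation", "data_variables").split()
    dims = {
        key: tuple(value.split())
        for key, value in parser.items("dataset.dimensions")
    }
    # Every data variable needs a declared dimensionality.
    missing = [v for v in variables if v not in dims]
    if missing:
        raise KeyError(f"no dimensions declared for: {missing}")
    return {v: dims[v] for v in variables}


def read_coordinates(cfg_text):
    """Return {dimension: list of coordinate labels}; empty if the section is absent."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    section = "dataset.dimension.coordinates"
    if not parser.has_section(section):
        return {}
    return {key: value.split() for key, value in parser.items(section)}


if __name__ == "__main__":
    print(read_dimensions(CFG_TEXT))
    print(read_coordinates(CFG_TEXT))
```

With this layout, `nrf2` and `lethality` can drop the `substance` dimension while `cext` and `cint` keep it.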

flo-schu commented 7 months ago

This also addresses the problem that the Simulation class currently can no longer be used for a vanilla simulation without data!

The problem is that too many methods used in __init__ were developed under the reversible-damage project branch.

Consider using only a minimal init and rather defining methods to deliver the desired function. Such as

instead of having to specify it manually.

REMEMBER! It should always be easy to use the tool.

flo-schu commented 2 months ago

Coordinates are currently not specified via the config backend, but are extracted from the data.
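
To illustrate what extracting coordinates from the data means, here is a rough sketch in plain Python. It is not the pymob implementation: `extract_coordinates` is a hypothetical helper, and the records are made-up observations in long format, where each record carries only the dimensions its variable actually has.

```python
def extract_coordinates(records, dimensions):
    """Collect the sorted unique values of each dimension found in the records."""
    coords = {dim: set() for dim in dimensions}
    for record in records:
        for dim in dimensions:
            # A variable without this dimension simply does not contribute to it.
            if dim in record:
                coords[dim].add(record[dim])
    return {dim: sorted(values) for dim, values in coords.items()}


# Made-up example data: cext has a substance dimension, nrf2 does not.
observations = [
    {"variable": "cext", "id": 0, "substance": "diuron", "time": 0},
    {"variable": "cext", "id": 0, "substance": "diuron", "time": 24},
    {"variable": "cext", "id": 1, "substance": "naproxen", "time": 24},
    {"variable": "nrf2", "id": 0, "time": 48},
]

coordinates = extract_coordinates(observations, ["id", "substance", "time"])
```

Records that lack a dimension are skipped for that dimension, so variables of different dimensionality can share one set of coordinates.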