This repository hosts the development of PEtab Select, the extension to PEtab for model selection, including the additional file formats and the Python 3 package.
The Python 3 package provides both the Python 3 and command-line (CLI)
interfaces, and can be installed from PyPI with `pip3 install petab-select`.
Further documentation is available at http://petab-select.readthedocs.io/.
Example Jupyter notebooks that demonstrate the usage of PEtab Select are available in the `doc/examples` directory.
AIC
: https://en.wikipedia.org/wiki/Akaike_information_criterion#Definition
AICc
: https://en.wikipedia.org/wiki/Akaike_information_criterion#Modification_for_small_sample_size
BIC
: https://en.wikipedia.org/wiki/Bayesian_information_criterion#Definition

forward
: https://en.wikipedia.org/wiki/Stepwise_regression#Main_approaches
backward
: https://en.wikipedia.org/wiki/Stepwise_regression#Main_approaches
brute_force
: Optimize all possible model candidates, then return the model with the best criterion value.
famos
: https://doi.org/10.1371/journal.pcbi.1007230

Note that the directional methods (forward, backward) find models with the smallest step size (in terms of number of estimated parameters). For example, given the forward method and a predecessor model with 2 estimated parameters, if there are no models with 3 estimated parameters, but some models with 4 estimated parameters, then the search may return candidate models with 4 estimated parameters.
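The smallest-step-size behavior described above can be sketched as follows. This is a simplified illustration, not the PEtab Select implementation: the function name and the model IDs are hypothetical, and the sketch only compares counts of estimated parameters, whereas a real search may also take into account which parameters are estimated.

```python
def smallest_step_candidates(predecessor_n_estimated, candidates, method="forward"):
    """Return IDs of the candidates that are the smallest step from the predecessor.

    `candidates` maps a (hypothetical) model ID to its number of estimated
    parameters. Only parameter counts are compared in this sketch.
    """
    if method == "forward":
        # Forward: candidates must have more estimated parameters.
        steps = {m: n - predecessor_n_estimated for m, n in candidates.items()}
    elif method == "backward":
        # Backward: candidates must have fewer estimated parameters.
        steps = {m: predecessor_n_estimated - n for m, n in candidates.items()}
    else:
        raise ValueError(f"not a directional method: {method}")

    # Keep only candidates that lie in the search direction.
    valid = {m: step for m, step in steps.items() if step > 0}
    if not valid:
        return []
    smallest = min(valid.values())
    return sorted(m for m, step in valid.items() if step == smallest)


# Predecessor with 2 estimated parameters; no candidate has exactly 3.
candidates = {"M_a": 2, "M_b": 4, "M_c": 4, "M_d": 5}
print(smallest_step_candidates(2, candidates))  # ['M_b', 'M_c']
```

Here the candidates with 4 estimated parameters are returned because no candidate with 3 exists, matching the example in the text.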
Column or key names that are surrounded by square brackets
(e.g. `[constraint_files]`) are optional.
A YAML file with a description of the model selection problem.
format_version: [string]
criterion: [string]
method: [string]
model_space_files: [List of filenames]
[constraint_files]: [List of filenames]
[predecessor_model_files]: [List of filenames]
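As a minimal sketch of how the required and optional keys above could be checked, the following assumes the YAML file has already been loaded into a Python dictionary. The function `check_problem` and the filename `model_space.tsv` are hypothetical and are not part of the PEtab Select package.

```python
REQUIRED_KEYS = ("format_version", "criterion", "method", "model_space_files")
OPTIONAL_KEYS = ("constraint_files", "predecessor_model_files")
LIST_KEYS = ("model_space_files",) + OPTIONAL_KEYS


def check_problem(problem):
    """Return a list of issues found in a problem dictionary (empty if none)."""
    issues = []
    for key in REQUIRED_KEYS:
        if key not in problem:
            issues.append(f"missing required key: {key}")
    for key in problem:
        if key not in REQUIRED_KEYS + OPTIONAL_KEYS:
            issues.append(f"unknown key: {key}")
    for key in LIST_KEYS:
        value = problem.get(key)
        # These keys hold lists of filenames when present.
        if value is not None and not isinstance(value, list):
            issues.append(f"{key} should be a list of filenames")
    return issues


problem = {
    "format_version": "beta_1",
    "criterion": "AIC",
    "method": "forward",
    "model_space_files": ["model_space.tsv"],  # hypothetical filename
}
print(check_problem(problem))  # []
```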
format_version
: The version of the model selection extension format (e.g. 'beta_1')
criterion
: The criterion by which models should be compared (e.g. 'AIC')
method
: The method by which model candidates should be generated (e.g. 'forward')
model_space_files
: The filenames of model space files.
constraint_files
: The filenames of constraint files.
predecessor_model_files
: The filenames of predecessor (initial) model files.

A TSV with candidate models, in compressed or uncompressed format.
model_subspace_id | petab_yaml | [sbml] | parameter_id_1 | ... | parameter_id_n |
---|---|---|---|---|---|
(Unique) [string] | [string] | [string] | [string/float] OR [; delimited list of string/float] | ... | [string/float] OR [; delimited list of string/float] |
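A row whose parameter columns hold `;`-delimited lists describes several models at once. The following is a minimal sketch of how such a row might be expanded into individual models, assuming every combination of the listed values is a valid candidate. The helper names, the parameter IDs `k1` and `k2`, and the filename are hypothetical.

```python
import itertools


def parse_value(text):
    """Convert a cell entry to a float, or keep it as a string (e.g. 'estimate')."""
    text = text.strip()
    try:
        return float(text)
    except ValueError:
        return text


def expand_subspace(row, parameter_ids):
    """Yield one {parameter ID: value} dictionary per model in the row."""
    options = [
        [parse_value(value) for value in row[parameter_id].split(";")]
        for parameter_id in parameter_ids
    ]
    # The Cartesian product enumerates every combination of listed values.
    for combination in itertools.product(*options):
        yield dict(zip(parameter_ids, combination))


row = {
    "model_subspace_id": "M1",
    "petab_yaml": "petab_problem.yaml",  # hypothetical filename
    "k1": "0.0;estimate",
    "k2": "0.0;1.1;estimate",
}
models = list(expand_subspace(row, ["k1", "k2"]))
print(len(models))  # 6
print(models[0])  # {'k1': 0.0, 'k2': 0.0}
```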
model_subspace_id
: An ID for the model subspace.
petab_yaml
: The PEtab YAML filename that serves as the base for a model.
sbml
: An SBML filename. If the PEtab YAML file specifies multiple SBML models, this can select a specific model by model filename.
parameter_id_1 ... parameter_id_n
: Parameter IDs that are specified to take specific values or be estimated. Example valid values are:
  - 0.0
  - 1.0
  - estimate
  - 0.0;1.1;estimate (the parameter can take the values 0.0 or 1.1, or be estimated according to the PEtab problem)

A TSV file with constraints.
petab_yaml | [if] | constraint |
---|---|---|
[string] | [SBML L3 Formula expression] | [SBML L3 Formula expression] |
petab_yaml
: The filename of the PEtab YAML file that this constraint applies to.
if
: As a single YAML can relate to multiple models in the model space file, this ensures the constraint is only applied to the models that match this if statement.
constraint
: If a model violates this constraint, it is skipped during the model selection process and not optimized.

Here, the format for a single model is shown. Multiple models can be specified as a YAML list of the same format.
The only required key is the PEtab YAML, as a model requires a PEtab problem.
Other keys may be required for particular uses of the format (e.g., the report
format should include `estimated_parameters`), or at particular stages of the
model selection process (e.g., the PEtab-compatible calibration tool should
provide `criteria` for model comparison).
[criteria]: [Dictionary of criterion names and values]
[estimated_parameters]: [Dictionary of parameter IDs and values]
[model_hash]: [string]
[model_id]: [string]
[parameters]: [Dictionary of parameter IDs and values]
petab_yaml: [string]
[predecessor_model_hash]: [string]
[sbml]: [string]
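As an illustration of how a calibration tool might fill the `criteria` entry, the sketch below computes AIC, AICc and BIC from a negative log-likelihood using the standard textbook definitions. The function name, the filename, and the choice of what counts as the number of measurements are assumptions. This is not necessarily how the PEtab Select package computes these values.

```python
import math


def compute_criteria(nllh, n_estimated, n_measurements):
    """Compute AIC, AICc and BIC from a negative log-likelihood (nllh)."""
    k, n = n_estimated, n_measurements
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n_measurements > n_estimated + 1")
    # -2 * log-likelihood equals 2 * nllh.
    aic = 2 * k + 2 * nllh
    # Small-sample correction added to AIC.
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) + 2 * nllh
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


model = {
    "petab_yaml": "petab_problem.yaml",  # hypothetical filename
    "criteria": compute_criteria(nllh=10.0, n_estimated=2, n_measurements=12),
}
print(model["criteria"]["AIC"])  # 24.0
```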
criteria
: The value of the criterion by which model selection was performed, at least. Optionally, other criterion values too.
estimated_parameters
: Parameter estimates, not only of parameters specified to be estimated in a model space file, but also parameters specified to be estimated in the original PEtab problem of the model.
model_hash
: The model hash, generated by the PEtab Select library.
model_id
: The model ID.
model_subspace_id
: Same as in the model space files.
model_subspace_indices
: The indices that locate this model in its model subspace.
parameters
: The parameters from the problem (either values or 'estimate') (a specific combination from a model space file, but uncalibrated).
petab_yaml
: Same as in model space files.
predecessor_model_hash
: The hash of the model that preceded this model during the model selection process.
sbml
: Same as in model space files.

Several test cases are provided, to test the compatibility of a PEtab-compatible calibration tool with different PEtab Select features.
The test cases are available in the `test_cases` directory, and are provided in
the model format.
Test ID | Criterion | Method | Model space files | Compressed format | Constraints files | Predecessor (initial) models files |
---|---|---|---|---|---|---|
0001 | (all) | (only one model) | 1 | |||
0002¹ | AIC | forward | 1 | |||
0003 | BIC | all | 1 | Yes | ||
0004 | AICc | backward | 1 | 1 | ||
0005 | AIC | forward | 1 | 1 | ||
0006 | AIC | forward | 1 | |||
0007² | AIC | forward | 1 | |||
0008² | AICc | backward | 1 | |||
0009³ | AICc | FAMoS | 1 | Yes | Yes |
1. Model M1_0 differs from M1_1 in three parameters, but has only 1 additional estimated parameter. The effect of this on model selection criteria needs to be clarified. Test case 0006 is a duplicate of 0002 that doesn't have this issue.
2. The noise parameter is removed; noise is fixed to 1.
3. This is a computationally expensive problem to solve. Developers can try a model selection initialized with the provided predecessor model, which is a model start that reproducibly finds the expected model. To solve the problem reproducibly ab initio, on the order of 100 random model starts are required. This test case reproduces the model selection problem presented in https://doi.org/10.1016/j.cels.2016.01.002 .