sebapersson / PEtab.jl

Create parameter estimation problems for ODE models
https://sebapersson.github.io/PEtab.jl/stable/
MIT License
33 stars 5 forks source link

Catalyst/PEtab integration #61

Closed TorkelE closed 11 months ago

TorkelE commented 1 year ago

Me and @sebapersson talked today about linking PEtab.jl and Catalyst.jl (https://github.com/SciML/Catalyst.jl) together.

The idea is that someone who has defined a Catalyst.jl reaction system, should be able to define the information required by PEtab.jl to create optimization problems for fitting cost functions. Basically, I would want to create an intermediary struct, PEtabModel (made-up name), so that one can run

petab_model = PEtabModel(...)
createPEtabODEProblem(petab_model, ...)

The hope is that someone who is using Catalyst.jl should be able to circumvent using files to utilise PEtab entirely. I'm really excited about this. Hopefully, we will be able to create really smooth workflows for parameter optimization of CRNs using Julia.

After our discussion, we decided that I should create a proposed structure that holds the information typically stored in the PEtab.jl files. We can then discuss how this one can be improved to better accommodate what PEtab.jl needs. Ultimately, I think having some form of struct PEtabModel will be useful. the equation is whenever we should:

Currently, I am thinking of creating something like this:

# Main struct, contains one value for each file in the PEtab format.
struct PEtabModel
    system::ReactionSystem
    experimental_conditions::Vector{PEtabExperimentalCondition}
    observables::Vector{PEtabObservable}  
    meassurments::Vector{PEtabMeassurment}
    parameters::Vector{PEtabParameter}
end

# Currently just a dictionary from parameters to their value.
# Currently indexing depends on vector indexes, however, we might want to make them dictionaries with proper IDs.
# If so, people could still provide conditions in vector form, and for these cases, we use randomly generated IDs (or the vector index as ID).
struct PEtabExperimentalCondition
    parameter_values::Dict{Num, Float64}
end

# Similar as for experimental conditions, we probably would want to base this on IDs instead of vector indexing.
struct PEtabObservable
    obs::Num
end

Alternatively, and maybe better, we have this be a DataFrame with the same information.
struct PEtabMeassurment
    exp_id::Int64    # The index of the corresponding experimental condition in experimental_conditions vector. 
    obs_id::Int64    # The index of the corresponding observable in the observables vector. 
    value::Float64
    time_point::Float64
end

# For all of these, good defaults in PEtab.jl exists. Maybe should make each type Union{Nothing,...}, with Nothing indicating that PEtabl.jl selects its default.
struct PEtabParameter
    parameter::Num
    isconstant::Bool
    value::Union{Nothing,Float64} # if isconstant==false, value=Nothing.
    lb::Float64
    ub::Float64
    prior::Distribution{Univariate, Continuous}
    scale::Symbol # :log10, :linear supported.
end

I think that it should be easy to, within Catalyst, create something like this and send to PEtab. What would be the ideal way for PEtab to receive it in though? If you have a format that you think you could work in, and ensure that that works as input to createPEtabODEProblem, then I can make a link to that, and we should be able to put an example together.

sebapersson commented 1 year ago

This looks like a good first proposal, and I think this integration would be great for allowing smooth parameter estimation in Julia. First. some general comments first (see comments on the struct):

# I think this a good structure (might be worth to call it CatalystPEtabModel at technically it is not a full 
# PEtab model as it does not follow the full PEtab format). Would also allow me to then apply an extra 
# parsing step before creating ODE-problem, so I could have 
# createPEtabODEProblem(model::CatalystPEtabModel, ...)
struct CatalystPEtabModel
    system::ReactionSystem
    experimental_conditions::Vector{PEtabExperimentalCondition}
    observables::Vector{PEtabObservable}  
    meassurments::Vector{PEtabMeassurment}
    parameters::Vector{PEtabParameter}
end

#=
Looks good - but each experimental condition should have a string-id (or similar). Similar for observable 
id 
=#
struct PEtabExperimentalCondition
    parameter_values::Dict{Num, Float64}
end

#=
Each observable should be given by a string ID 
=#
struct PEtabObservable
    obs::Num
end

#=
I think a DataFrame is the best option here (but we can keep this struct to allow anyone to simulate data). 
See my comments on the fields. There are also observable parameters and PreEq ID:s (if we have a steady
state simulation), but I think we can add those later (should be straightforward).
=#
struct PEtabMeassurment
    exp_id::String # Should be a text id 
    obs_id::String    # Should be a text id 
    value::Float64
    time_point::Float64
    #=
     The sigma-parameter (measurement noise) can either be a value or a parameter in the parameters-struct.
    Has to be here, as different measurement-series (even of the same observable) can have different 
   measurement noise.
    =#
   noise_parameter::Union{Float64, Num} 
end

#=
# For all of these, good defaults in PEtab.jl exists. Maybe should make each type Union{Nothing,...}, 
with Nothing indicating that PEtabl.jl selects its default. Nothing is a good thing here for defaults :), 
note that prior should be allowed to be nothing (if no prior is given).
This looks good, but by PEtab standard replace isconstant with estimate 
=#
struct PEtabParameter
    parameter::Num
    isconstant::Bool
    value::Union{Nothing,Float64} # if isconstant==false, value=Nothing.
    lb::Float64
    ub::Float64
    prior::Distribution{Univariate, Continuous}
    scale::Symbol # :log10, :linear and :log supported.
end

Overall I think the following would work. If catalyst provides a CatalystPEtabModel then PEtab can use the information to basically create the PEtab-files (they do not have to explicitly created). Given the PEtab-files a PEtabODEProblem can easily be created. So I imagine a workflow like

petab_model = CatalystPEtabModel(...)
# I expand createPEtabODEProblem to handle CatalystPEtabModel in PEtab.jl 
petab_problem = createPEtabODEProblem(petab_model::CatalystPEtabModel, ...)

So if you could provide a CatalystPEtabModel on the formt above for a simple model, say; k1 : X -> Y k2 : Y -> X Where we aim to estimate k1, k2, assuming normally distributed measurements of Y with sigma ~ N(0, 1) I can expand createPEtabODEProblem to handle CatalystPEtabModel.

Later in order to create a robust test-suite we can recreate the PEtab test-suite using reaction-system.

Overall, what do you think?

TorkelE commented 1 year ago

This looks good, here is the altered CatalystPEtabModel format corresponding to your suggestions right now.

struct CatalystPEtabModel
    system::ReactionSystem
    experimental_conditions::Dict{String,PEtabExperimentalCondition}
    observables::Dict{String,PEtabObservable}  
    meassurments::DataFrame
    parameters::Vector{PEtabParameter}
end

struct PEtabExperimentalCondition
    parameter_values::Dict{Num, Float64}
    id::String
end

struct PEtabObservable
    obs::Num
    id::String
end

struct PEtabParameter
    parameter::Num
    estimate::Bool
    value::Union{Nothing,Float64} 
    lb::Union{Nothing,Float64}
    ub::Union{Nothing,Float64}
    prior::Union{Nothing,Distribution{Univariate, Continuous}}
    scale::Union{Nothing,Symbol} # :log10, :linear and :log supported.
end

I will create an example as well that we could use to try and fit data. I agree, let's do steady state after we have a working methodology without.

My only question is regarding having meassurments as a DataFrame. When we have it as an explicit struct, don't we guarantee that it has a certain number of fields and that these are filled, while a DataFrame could contain additional fields (or miss some)? Is there a way to define a DataFrame type where it has a certain number of fields? If not I guess we could add a quick check to the data frame ensuring it contain the right informtation.

TorkelE commented 1 year ago

Also, exactly how do you want the u0 defined? For a species X, should I simply create a parameter X which denotes its initial condition?

sebapersson commented 1 year ago

I don't think we can restrict a DataFrame, on the other side, writing a function to check the DataFrames is straightforward.

Good question, preferably as a state_map (which is already how we do it), i.e on the format:

state_map = [x => 207.6*k1, y => 0.0]

That is, for each model state either a numerical value, or an equation (depending on model parameters, which can be an initial value) are okay to supply.

TorkelE commented 1 year ago

Sounds good.

Should I make the initial conditions part of the PEtabExperimentalCondition struct, e.g. expand it to:

struct PEtabExperimentalCondition
    parameter_values::Dict{Num, Float64}
    u0::Vector{Pair{Num.Union{Float64,Num}}}
end

? Here I put the first value to a Num, but could be a Symbol or something else as well, was a bit unsure.

sebapersson commented 1 year ago

I think actually u0 should be something entirely separate (which is then provided into CatalystPEtabModel). I am thinking about doing as is done in ModellingToolkit where when building on ODEProblem the user can if they want to provide a state-map (as here https://docs.sciml.ai/ModelingToolkit/stable/tutorials/ode_modeling/).

And for parameter values the first value being Num looks good.

TorkelE commented 1 year ago

I am not entirely sure what you mean, You mean something like this where u0 is a separate field?

struct CatalystPEtabModel
    system::ReactionSystem
    experimental_conditions::Vector{PEtabExperimentalCondition}
    observables::Vector{PEtabObservable}  
    meassurments::Vector{PEtabMeassurment}
    parameters::Vector{PEtabParameter}
    u0::Vector{Pair{Num.Union{Float64,Num}}}
end

Then if one wants u0 to varry between various experimental conditions one set that specific u0 value to be a parameter, the value which depends on experimental_conditions["ExpX"].parameter_values?

Also, a second question. If you have e.g. parameter p depend on experimental_conditions["ExpX"].parameter_values[p], what should p be in the parameters field? I presume that it must have estimate=true. However, can it still have a value (which is used as the default when experimental_conditions["ExpX"].parameter_values does not contain the parameter)?

sebapersson commented 1 year ago

Yes, this is how I see the struct. And yes, if we want some u0 to change between conditions it is better if we make that specific u0 value to change a parameter. I think this is a good solution as, at least in my experience, typically only a few u0 values are changed between experimental conditions, so instead of having to provide an u0 per experimental condition, it is probably better to specify the u0:s that change by a parameter value.

Now a complication here, which I think you touch on with the second question, is that if an initial value is a a parameter how do we specify that? If it is a parameter that is constant or we want to estimate is should be specified in PEtabParameter. However, if it is a control-parameter (dictating an experimental condition) it should not be specified in PEtabParameter. In the PEtab standard control parameters are not a part of PEtabParameter (as they are control parameters and not yet allowed to be estimated), rather the user must provide a value for the control parameter for each experimental condition. So I am thinking that if a user provide a new parameter as initial value control parameter, then we should enforce the user provides a value on that parameter for each experimental condition. What do you think?

TorkelE commented 1 year ago

(I work in a category of systems where the only difference between experimental conditions is that one changes all the non-trivial (non-0) u0 values, however, for the greater good I don't mind going with your solution, seems like it should work out)

So, in essence, we have 3 types of parameter values? 1) Parameters that we would like to estimate (that might have a prior distribution and bounds). 2) Parameters that have a known value. 3) Parameters that have known values, but with these varying between each experimental condition. Here, the first two would be required to be specified in the PEtab parameters field/file, while the third would be given in the experimental_conditions field/file (and if it is not provided for each experimental condition, and error would be thrown). It seems like a weird dissonance that (2) is provided in parameters but (3) is not (however, I can also see why, as they have to be provided elsewhere).

Still, I think we have a fully working format here, so maybe we go with this to make something that works, and then we can discuss whenever it makes sense to throw things around a bit later on?

sebapersson commented 1 year ago

I see what you mean with 2 and 3, and typically in the PEtab standard 2 does not have to specified explicitly (as often the constant values for these are already set in the SBML file), so if it could be possible to set constant parameter when specifying the reaction system then given that, it would not be needed to specify all constant parameters in Parameters?

And I agree, we can go with this, get something that works, and then make potential changes-

sebapersson commented 1 year ago

I see what you mean with 2 and 3, and typically in the PEtab standard 2 does not have to specified explicitly (as often the constant values for these are already set in the SBML file), so if it could be possible to set constant parameter when specifying the reaction system then given that, it would not be needed to specify all constant parameters in Parameters?

And I agree, we can go with this, get something that works, and then make potential changes.

TorkelE commented 1 year ago

Technically it would be possible to parse the provided ReactionSystem, and substitute in parameter values before the system is sent over to PEtab. However, ideally, I would like to keep these separate, as this is how they are handled across the entire SciML system. I also think there's a decent number of cases where this is advantageous (e.g. in many cases system input is modelled as a parameter, and a callback is used to change this value to signal activation. This is no longer possible if a parameter is substituted in).

I will keep things as suggested now, and then we can add this to the list of things to consider once we have a methodology sorted out.

TorkelE commented 1 year ago

Does this work?

### Preparations ###

# Fetch required packages.
using Catalyst, OrdinaryDiffEq, Distributions, DataFrames, Plots

# Prepares the model simulation conditions.
rn = @reaction_network begin
    (p,d), 0 <--> X
    (1,k), X <--> Y
end
u0 = [1.0, 0.0]
tspan = (0.0, 10.0)
p = [1.0, 0.2, 0.5]

### Declares CatalystPEtabModel Structure ###

# The conditiosn for an experiment.
struct PEtabExperimentalCondition
    parameter_values::Dict{Num, Float64}
    id::String
end
# An observable value.
struct PEtabObservable
    obs::Num
    id::String
end
# A parameter.
struct PEtabParameter
    parameter::Num
    estimate::Bool
    value::Union{Nothing,Float64} 
    lb::Union{Nothing,Float64}
    ub::Union{Nothing,Float64}
    prior::Union{Nothing,Distribution{Univariate, Continuous}}
    scale::Union{Nothing,Symbol} # :log10, :linear and :log supported.
end
# A full PEtab model.
struct CatalystPEtabModel
    system::ReactionSystem
    experimental_conditions::Dict{String,PEtabExperimentalCondition}
    observables::Dict{String,PEtabObservable}  
    meassurments::DataFrame
    parameters::Vector{PEtabParameter}
    u0::Vector{Pair{Num,Union{Float64,Num}}}
end

### Performs Experiment ###

# Meassurment settings.
meassurment_times = 2.0:2.0:10.0
meassurment_error_dist = Normal(0.0, 0.4)

# Make simulation.
oprob = ODEProblem(rn, u0, tspan, p)
sol = solve(oprob, Tsit5(); tstops=meassurment_times)

# Get meassurment values.
meassured_values_true = [sol[2, findfirst(sol.t .== mt)] for mt in meassurment_times]
meassured_values = meassured_values_true .+ rand(meassurment_error_dist, length(meassured_values_true))

# Plot the experiment an meassurments.
plot(sol; idxs=2)
plot!(meassurment_times, meassured_values; seriestype=:scatter)

### Prepares the PEtab Structure ###
@unpack X, Y, p, d, k = rn

# The system field.
system = rn

# The experimental conditions field.
exp1 = PEtabExperimentalCondition(Dict([X=>1.0, Y=>0.0, d=>0.2, k=>2.0]), "Obs1")
experimental_conditions = Dict(["Exp1" => exp1])

# The observables field.
obs1 = PEtabObservable(Y, "Obs1")
observables = Dict(["Obs1" => obs1])

# The meassurments field
nM = length(meassured_values)
meassurments = DataFrame(exp_id=fill("Exp1", nM), obs_id=fill("Obs1", nM), value=meassured_values, time_point=meassurment_times, noise_parameter=fill(0.4,nM))

# The parameters field.
par_p = PEtabParameter(p, false, nothing, 1e-2, 1e2, nothing, :log10)
par_d = PEtabParameter(d, true, 0.2, nothing, nothing, nothing, nothing)
par_p = PEtabParameter(k, true, 2.0, nothing, nothing, nothing, nothing)
parameters =[par_X, par_Y, par_p, par_d, par_p]

# The initial conditions field.
u0 = [X => 1.0, Y => 0.0]

# Creates the model
petab_model = CatalystPEtabModel(system, experimental_conditions, observables, meassurments, parameters, u0)

### Saves Model to File ###
using Serialization
serialize("petab_model.jls", petab_model)

Currently, there is only one parameter to estimate, however, if that works we should be able to make an additional parameter unknown, or something that is carried in experiments. I'm also sending the petab_model.jls file (although the CatalystPEtab struct must still be declared to load it)

sebapersson commented 1 year ago

Thanks, this looks like it would work. Will try to take a look and come back this or early next week :)

sebapersson commented 1 year ago

This turned out to be easier than expected.

(Note I made some small alterations to the Structs)

include(joinpath(@__DIR__, "Catalyst_functions.jl"))

# Prepares the model simulation conditions.
rn = @reaction_network begin
    (p,d), 0 <--> X
    (1,k), X <--> Y
end

### Performs Experiment (simulate data) ###
u0 = [1.0, 0.0]
tspan = (0.0, 10.0)
p = [1.0, 0.2, 0.5]
meassurment_times = 2.0:2.0:10.0
meassurment_error_dist = Normal(0.0, 0.4)
oprob = ODEProblem(rn, u0, tspan, p)
sol = solve(oprob, Tsit5(); tstops=meassurment_times)
meassured_values_true = [sol[2, findfirst(sol.t .== mt)] for mt in meassurment_times]
meassured_values = meassured_values_true .+ rand(meassurment_error_dist, length(meassured_values_true))

### Prepares the PEtab Structure ###
@unpack X, Y, p, d, k = rn

# The system field.
system = rn

# The experimental conditions field.
exp1 = PEtabExperimentalCondition(Dict([d=>0.2, k=>2.0]))
experimental_conditions = Dict(["Exp1" => exp1])

# The observables field.
obs1 = PEtabObservable(Y, :lin, 0.4)
observables = Dict(["Obs1" => obs1])

# The meassurments field
nM = length(meassured_values)
meassurments = DataFrame(exp_id=fill("Exp1", nM), obs_id=fill("Obs1", nM), value=meassured_values, time_point=meassurment_times, noise_parameter=fill(0.4,nM))

# The parameters field (we are going to create a nice constructor here)
par_p = PEtabParameter(p, true, nothing, 1e-2, 1e2, nothing, :log10)
par_d = PEtabParameter(d, false, 0.2, nothing, nothing, nothing, nothing)
par_k = PEtabParameter(k, false, 2.0, nothing, nothing, nothing, nothing)
petab_parameters = [par_p, par_d, par_k]

# The initial conditions field.
state_map = [X => 1.0, Y => 0.0]

# Creates the model
petab_model = readPEtabModel(system, experimental_conditions, observables, meassurments, 
                             petab_parameters, state_map, verbose=true)

# Can now easily be made into PEtabODEProblem
petabProblem = createPEtabODEProblem(petab_model)
p = [log10(20)]
f = petabProblem.computeCost(p)
∇f = petabProblem.computeGradient(p)
Δf = petabProblem.computeHessian(p)

I added functions to parse the PEtab-input structs into the PEtab-file format (as DataFrames). Given that, I could build a PEtabODEProblem from the reaction system. Overall, with this approach it should also be easy to add additional features such as stady-state conditions, noise parameters etc. (basically recreate the PEtab test-suite).

Going forward, in order to create a smooth user experience I think we need to figure out how to specify initial values, and how to set default parameter values (parameters which are constant and are not control-parameters). An example, currently if I want to estimate initial-value I have to do the following when defining the reaction system:

rn = @reaction_network begin
    @parameters a0 b0 # Initial values to estimate must be parameters 
    (k1), A --> B
    (k2), B --> A
end

@unpack A, B, k1, k2, a0, b0 = rn
state_map = [A => a0, B => b0]

While I imagine something like this would be smoother

rn = @reaction_network begin
    @parameters a0 b0 # Initial values to estimate must be parameters 
    @species A(t)=a0 B(t)=b0 # Only specify for species having non-zero initial value 
    (k1), A --> B
    (k2), B --> A
end

Likewise, instead of setting constant non-control parameter values via PEtabParameter I think it would be nice to set these directly when defining the reaction system. This is how it is done typically for PEtab-model (default parameter values or the initial-value equation is set in the SBML-file). For this to work, however, I need to be able to extract the default values from the reaction system (to properly setup the functions needed by PEtab.jl), is it something that can be done? Further, any more input on this?

TorkelE commented 1 year ago

Looks good. Is there somewhere where I can check the updated structs?

It is possible to set default values of parameters and initial conditions already: https://docs.sciml.ai/Catalyst/stable/catalyst_functionality/dsl_description/#dsl_description_defaults Not sure if it is possible to make initial conditions parameters though. Whether or not that is easy to do depends on ModelingToolkit, I could check this.

I'm mixed with using default parameter values within the ReactionSystem as the way to define values that are neither estimated nor varies between experimental settings. From the workflow where you create a ReactionSystem, create a PEtabModel, and then calibrate the parameters it makes sense. But if you want to e.g.

TorkelE commented 1 year ago

It should also be noted that some parameter properties can be defined for parameters via metadata: https://docs.sciml.ai/ModelingToolkit/stable/basics/Variable_metadata/#Bounds https://docs.sciml.ai/ModelingToolkit/stable/basics/Variable_metadata/#Mark-parameter-as-tunable Metadata can be added to parameters in Catalyst already (and we can add more fields if we want to) e.g.

@reaction_network begin
    @parameters p1 [mymetadata=true] p1 [someothermetadata=2.0] 
end

For now we might want to ignore these fields and make it work without, and then we can add functionality that infers these properties if they exist.

sebapersson commented 1 year ago

Yes, the updated structs can be found here: https://github.com/sebapersson/PEtab.jl/blob/Catalyst_integration/test/Catalyst_functions.jl. (note I removed PEtabCatalystModel as I figured out how to process these structs directly into a PEtabModel)

Regarding your points on parameter values I see what you mean. So we have that setting default values in non-ideal for workflows, while setting constant parameters via PEtabParameter is to clunky. Would it then be possible to set values for constant parameters via a parameter-map - like lets say we would want to set a constant value for k1 (and set k2 later via PEtabParameter)

rn = @reaction_network begin
    @parameters a0 b0 # Initial values to estimate must be parameters 
    (k1), A --> B
    (k2), B --> A
end

@unpack A, B, k1, k2, a0, b0 = rn
state_map = [A => a0, B => b0]
parameter_map [k1 => 3.0]

Then I could accept parameter_map when building the PEtab model and take this constant into consideration when building the ODEProblem.

Regarding parameter dependent initial values, do you think it is better to use a state-map, or specify the relationship in the reaction system?

TorkelE commented 1 year ago

Setting it via a parameter map like that seems good. It is also similar to how it is done with e.g. ODEProblems, so should be quite intuitive for most users.

For the states, I am currently leaning towards defining this in the reaction system. Having initial conditions being defined by parameters is something I can see being useful under other circumstances as well, so seems good to make it a general thing. I'm checking how this should work within the ModelingToolkit framework right now though. If it turns out to be a mess we might have to go with defining it in the state map and it being a feature within PETab only.

sebapersson commented 1 year ago

Sounds good, I will then add a parameter-map.

I will also code up the PEtab test-cases using Catalyst (too see if there is any useful feature we do not currently cover).

TorkelE commented 1 year ago

Setting intial conditions as parameters should work well:

using Catalyst
rn = @reaction_network begin
    @parameters x0
    @species X(t)=x0
    (p,d), 0 <--> X
end
oprob = ODEProblem(rn, [], (0.0,10.), [:p=>1.0, :d => 3.0, :x0=>100.0])
sol = solve(oprob)

The value can be retrieved via e.g.

@unpack X = rn
Catalyst.get_defaults(rn)[X]

or found in

X.val.metadata[Symbolics.VariableDefaultValue]
sebapersson commented 1 year ago

Then I say we go with setting initial-values directly in the reaction system, I also managed to this approach working with PEtab.jl and I think it looks nice (below is the first PEtab test-case, note I added constructor for PEtabParameter):

#=
    Recrating PEtab test-suite for Catalyst integration to robustly test that
    we support a wide arrange of features for PEtab integration
=#

using Test
include(joinpath(@__DIR__, "..", "Catalyst_functions.jl"))

# Define reaction network model 
rn = @reaction_network begin
    @parameters a0 b0
    @species A(t)=a0 B(t)=b0
    (k1, k2), A <--> B
end

# Measurement data 
measurements = DataFrame(exp_id=["c0", "c0"],
                         obs_id=["obs_a", "obs_a"],
                         time_point=[0, 10.0],
                         value=[0.7, 0.1],
                         noise_parameter=0.5)

# Single experimental condition                          
experimental_conditions = Dict(["c0" => PEtabExperimentalCondition(Dict())])

# PEtab-parameter to "estimate"
petab_parameters = [PEtabParameter(:a0, value=1.0, scale=:lin),
                    PEtabParameter(:b0, value=0.0, scale=:lin),
                    PEtabParameter(:k1, value=0.8, scale=:lin),
                    PEtabParameter(:k2, value=0.6, scale=:lin)]

# Observable equation                     
@unpack A = rn
observables = Dict(["obs_a" => PEtabObservable(A, :lin, 0.5)])

# Create a PEtabODEProblem 
petab_model = readPEtabModel(rn, experimental_conditions, observables, measurements,
                            petab_parameters, verbose=true)
petab_problem = createPEtabODEProblem(petab_model)

# Compute negative log-likelihood 
nll = petab_problem.computeCost(petab_problem.θ_nominalT)
@test nll ≈ 0.84750169713188 atol=1e-3

Overall, integrating PEtab with Catalyst seem to work nicely. Next I could expand the importer and structs to cover additional features (e.g. steady-state simulations) by recreating the PEtab test-suite, and after that I propose we see if we should change some of the structs? (I am currently not entirely happy with how we specify measurement data, feels a bit clunky).

TorkelE commented 1 year ago

Looks good!

Quick questions:

sebapersson commented 1 year ago

Good questions.

  1. I just set the values for the test (Currently the user can choose whether or not to provide values for parameters that are to be estimated).
  2. This is a mistake from my part (when I set things up). In (A, :lin, 0.5) 0.5 is the noise-formula, which as we here have to be 0.5 (as we know the measurement-noise in this case). So in this case I do not need to specify noise (per PEtab standard) in measurements. In case I would have a noise-formula like (A, :lin, A * noiseParameter1) then in measurements I would have to provide noise_parameter=[a, b] where a, b can be constant values that replace noiseParameter1, or even parameters specified by PEtabParameter that replace noiseParameter1. All-to-all, I think a good solution is to have the user provide the noise-formula in the observable, and in case it is a non-constant we can force the user to also provide a noise-parameter? And exactly, lin suggests a linear scale for the difference (allowed are lin, log, log10), if we instead would have used log we would have assumed log-normal measurement error.
  3. I have not added support for this. Currently you can either have a normal-distribution (using lin), or a log-normal (using log or log10 for the difference between measured or non-measured). If you think more distributions would be relevant we can add it?
TorkelE commented 1 year ago

Sounds good. I am still a bit uncertain about (1) through. If a0, b0, k1, k2 are estimated in this case (they are, right?), does setting this values have any direct impact on what is going on?

With regard to (2). Yes, it seems that when we are done we might want to have a series of checks to ensure nothing is missing. Your suggested approach here seems good (provide the noise formula in the observables, and any quantity here that carries between measurements is given as noise parameters there.

Right now, the number (noiseFormula) in observable is a formula for the std in the normal distribution, right?

(3) To be honest, linear and log-normally distributed noise will probably cover almost all cases, and I cannot directly say there is another type of noise I'd want right now. I think it is more a question if we want to future-proof us so that the approach can work with other noise distributions when/if they get added. Would it make sense to add a field to PEtabObservable for this, and for now we just assume that is always going to be a normal distribution?

sebapersson commented 1 year ago
  1. Setting their values have no effect at all (I do not compile with respect to the set values). Rather when I extract the nominal-parameter vector petab_problem.θ_nominalT in θ_nominalT the set values can be found. So it is entirely for the convenience of testing the code against cases where we know the likelihood value.
  2. Good, then I will keep the noise formula in observable (and make it a field the user must provide).
  3. I agree with you, and I am hesitant to add more distributions unless they are truly needed. This is because in order to compute the gradient correctly for adjoint-sensitivities and Hessian approximation for the Gauss-Newton approximation at some places I rely on analytically derived formulas for the normal distribution - adding a new distributions means to hardcode some partial derivatives. Thus, I would rather opt for writing in the documentation that if the user wants to use a different distribution they should file an issue, and then I can add it.
TorkelE commented 1 year ago

Excellent, yes, putting such a note about other distributions sounds like the way to go. Didn't even realise that there was customized code for normal distributions.

If we do not want to let initial conditions be parameters that are estimated, we can provide their values just using a map when we run readPEtabModel? And for the parameter values that remain fixed throughout the simulation.

sebapersson commented 1 year ago

Yes, providing the the constant initial values like a map sounds like a good plan. And in the case the user does not provide anything for initial value I say we assume it is zero.

TorkelE commented 1 year ago

Assuming 0 if missing sounds good. I think the pieces are falling into place? I will start drafting documentation for Catalyst. Once there is a version of the interface working in PEtab.jl I might be able to help with the coding there to cover corner cases and similar stuff.

sebapersson commented 1 year ago

I agree the pieces are falling into place. I will work through the PEtab test-cases (to make sure there is nothing we miss in term of support), and add things like steady-state simulations. Have some other things this week, but should be done with this latest by the end of the week.

sebapersson commented 1 year ago

The Catalyst importer now supports all PEtab features (some features required a bit of work :) ), see the example below (where I tried to include everything):

#=
    Showing all different things we can handle with PEtab.jl 
=#

using Distributions
include(joinpath(@__DIR__, "..", "Catalyst_functions.jl"))

# Define reaction network model 
rn = @reaction_network begin
    @parameters se0
    @species SE(t)=se0 # s0 = initial value for S
    c1, S + E --> SE
    c2, SE --> S + E
    c3, SE --> P + E
end
#= 
Set a constant state (in this case E and P) and parameter (in this case c1) 
values via a state- and parameter-map respectively with a vector on the form 
Symbol(variable) => value
=#
state_map =  [:E => 1.0, :P => 0.0]
parameter_map = [:c1 => 1.0]

#=
For each simulation condtion we can specify values for states (in this case S) 
and/or model parameters (in this case :c2) via a Dict. For these controll-
parameters values must be set for each simulation condition (I must add a check 
for this). Note - c0_pre corresponds to a pre-equilibration - which is a steady-
state simulation carried out prior to matching the model against data (to mimic 
that cells are at steady-state at time zero).
=#
experimental_conditions = Dict("c0_pre" => PEtabExperimentalCondition(Dict(:S => 2.0, :c2 => 2.0)),
                               "c0" => PEtabExperimentalCondition(Dict(:S => 10.0, :c2 => 2.0)))

#=
A PEtabObservable has fields;
    observable formula : (formula for what is observed, e.g S + S)
    measurment noise distribution . (either :lin (normal) or log-normal (:log))
    noise-formula : formula for measurement noise σ

Parameters on the form noiseParameter... maps to the parameter or values specifed 
under noise_parameter in the measurement data, and parameters on the form 
observableParameter... maps to the parameters or values specifed in the 
observable_parameters column in the measurement data. This allows the user to 
specify different, for example, noise parameters per measurement to mimic a setting 
where the same observable (e.g S) might been observed in different experiments, or 
with different assays.
=#
@unpack S, P = rn
@parameters noiseParameter1_obs_S observableParameter1_obs_P observableParameter2_obs_P
observables = Dict("obs_S" => PEtabObservable(S, :log, noiseParameter1_obs_S * S), 
                   "obs_P" => PEtabObservable(observableParameter1_obs_P*P + observableParameter2_obs_P, :lin, 1.0))

#=
The column pre_eq_id corresponds to the pre-equilibration criteria, that is under which 
simulation-condition setting to carry out a steady-state simulation prior to matching the 
model against data (e.g to mimic that cells are at steady-state at time zero). The column 
observable_parameters holds the values for the observable-parameters specified in the 
observable formula, and these can be parameters (specified in PEtabParameters) or constant 
values (e.g 1.0). In case of several values these should be separated by ;. Similiar holds 
for noise_parameter - these map to the noiseParameter... values in the noise-formula, and 
these can be parameters or values. In case there are now noise-parameters, or observable-
parameters these columns can be left out (they are optional).
=#
measurements = DataFrame(exp_id=["c0", "c0", "c0", "c0"],
                         pre_eq_id=["c0_pre", "c0_pre", "c0_pre", "c0_pre"], # Steady-state pre-eq simulations 
                         obs_id=["obs_S", "obs_S", "obs_P", "obs_P"],
                         time_point=[0.0, 10.0, 1.0, 20.0],
                         value=[0.7, 0.1, 1.0, 1.5], 
                         observable_parameters=["", "", "scale_P;offset_P", "scale_P;offset_P"],
                         noise_parameter=["noise", "noise", "", ""])

#=
For a PEtab parameter to estimate the user must specify id. Then the user can choose scale 
to estimate the parameter on (:log10 (default), :lin and :log). For each parameter any 
univariate cont. prior can be set to turn the parameter estimation problem into a 
maximum a posteriori problem. If prior_on_linear_scale=true (default) the prior applies 
on the linear-scale, while if false it applies on the parameter scale (e.g log10 in case 
the user choose scale=:log10). In case the parameter is not be estimated the user can 
set estimate=false (default true). lb and ub (lower and upper bounds) can also be 
set for each parameter.
=#
petab_parameters = [PEtabParameter(:c3, scale=:log10, prior=Normal(0.0, 2.0), prior_on_linear_scale=false),
                    PEtabParameter(:se0, scale=:lin, prior=LogNormal(1.0, 0.5)), 
                    PEtabParameter(:scale_P, scale=:log10),
                    PEtabParameter(:offset_P, scale=:log10),
                    PEtabParameter(:noise, scale=:log10)]

# Given all this it is easy to set a PEtabODEProblem and then compute the cost, gradient 
# and Hessian.
petab_model = readPEtabModel(rn, experimental_conditions, observables, measurements,
                            petab_parameters, stateMap=state_map, parameterMap=parameter_map, 
                            verbose=true)
petab_problem = createPEtabODEProblem(petab_model)

# Compute negative log-likelihood 
θ = [1.0, 1.0, 1.0, 1.0, 1.0]
f = petab_problem.computeCost(θ)
∇f = petab_problem.computeGradient(θ)
Δf = petab_problem.computeHessian(θ)

Overall, we can have priors, pre-equillbration steady-state simulations, different measurement noise distributions, and estimate observable and noise parameters.

What is left is to polish it a bit (name things better) and add checks that the user input is correct. For naming and input format I have the following suggestions:

  1. We drop PEtabExperimentalCondition and simply ask the user to provide a Dict (I will then write checks to see that the provided dict assign values correctly to states and parameters in the model).
  2. Instead of experimental_conditions we refer to it as simulation_conditions
  3. For measurement data I suggest we rename or allow the following names (note | means two names are allowed) the following columns: time_point -> time, obs_id -> (observable_id | obs_id), noise_parameter -> noise_parameters, value -> measurement, exp_id -> simulation_id and pre_eq_id -> (pre_equilibration_id | pre_eq_id). I can then write checks to ensure input format is correct.
  4. For PEtabObservable I plan to add :lin as default value.

Any other input on how to specify the PEtab-structs and input data? I also plan to add Catalyst as a package-extension for PEtab.jl.

TorkelE commented 1 year ago

This looks good. I will be travelling this weekend, but starting next week and when the Catalyst extension is ready I will start to create tutorials in Catalyst.jl docs (we can put them here as well if you want).

  1. This sounds good.
  2. This might be good, but don't have any strong opinions.
  3. The renamings sounds good.
  4. Sounds good.

Looking forward, and especially with regard to the pre-equilibrium functionality. Is there a way to provide a callback to the simulation for the parameter estimation? I think some experimental layouts are already covered without callbacks, but if you have an experiment where the system remains for some number of time units in the steady state, and then the system is activated, this might require callbacks.

sebapersson commented 1 year ago

Sounds good with tutorials!

PEtab.jl has functionality for handling callbacks but callbacks are not yet support for Catalyst (only for SBML models). I agree it is a useful feature, how would you like callbacks for a Catalyst model to be specified? Is there some symbolic way?

TorkelE commented 1 year ago

In DiffEq, callbacks are typically provided with the solve(problem,...;callback=) command. In PEtab.jl the solve options seems to, generally, be provided in the odeSolverOptions structure, so that would be one alternative. However, Catalyst ReactionSystems should support events being provided in the system structure, just like ModelinGToolkit ODESystems: https://docs.sciml.ai/ModelingToolkit/stable/basics/Events/

I haven't actually played around with this functionality. However, ReactionSystems have the continuous_events and discrete_events fields, so if you think that would be better, this should work.

sebapersson commented 12 months ago

For continuous_events, i.e. scenarios where the trigger depend on the model states like if X < 1 --> Y = 1, I think the most convenient solution would be for the user to try to encode these in the system structure.

For discrete events where we know the event time (or the event time is a model parameter), it can become very involved for the user to code the event via ModellingToolkit syntax (e.g. code something like at time t1 add substrate T and at time t2 remove substrate T). Moreover, if the event trigger happens to depend on a model parameter to estimate, e.g. t == c1, then to properly compute gradients I need to convert the time-variable to dual number, and have a function that computes tstops based on the model parameters (we currently do this for the SBML importer); all-to-all handling discrete events like dosages is messy. Adding PEtab support for dosage schemes is currently discussed https://github.com/PEtab-dev/PEtab/issues/564, and to ensure correctness I think the best solution for discrete events would be to add some additional PEtab-struct like PEtabDosage, then under the hood given a PEtabDosage write appropriate discrete callbacks. What do you think?

sebapersson commented 12 months ago

On another note, I have now added checks for the PEtab input format, and on the Catalyst_integration branch added Catalyst as a package extension. I have also written tests to cover all PEtab features. I only have to add documentation (hopefully done this week) before I merge with main. So all-to-all, so I think you can now start to create tutorials.

Something that can help you is hopefully this example that should cover most features:

#=
    Showing all different things we can handle with PEtab.jl 
=#

using Catalyst
using CSV
using DataFrames
using Distributions
using PEtab

# Define reaction network model 
rn = @reaction_network begin
    @parameters se0
    @species SE(t)=se0 # se0 = initial value for S
    c1, S + E --> SE
    c2, SE --> S + E
    c3, SE --> P + E
end
#= 
Set a constant state (in this case E and P) and parameter (in this case c1) 
values via a state- and parameter-map respectively with a vector on the form 
Symbol(variable) => value
=#
state_map =  [:E => 1.0, :P => 0.0]
parameter_map = [:c1 => 1.0]

#=
For each simulation condtion we can specify values for states (in this case S) 
and/or model parameters (in this case :c2) via a Dict. For these controll-
parameters values must be set for each simulation condition. Note - c0_pre corresponds 
to a pre-equilibration - which is a steady-state simulation carried out prior to 
matching the model against data (to mimic that cells are at steady-state at time zero).
=#
simulation_conditions = Dict("c0_pre" => Dict(:S => 2.0, :c2 => 2.0),
                             "c0" => Dict(:S => 10.0, :c2 => 2.0))

#=
A PEtabObservable has fields;
    observable formula : (formula for what is observed, e.g S + S)
    measurment noise distribution . (either :lin (normal) or log-normal (:log))
    noise-formula : formula for measurement noise σ

Parameters on the form noiseParameter... maps to the parameter or values specifed 
under noise_parameter in the measurement data, and parameters on the form 
observableParameter... maps to the parameters or values specifed in the 
observable_parameters column in the measurement data. This allows the user to 
specify different, for example, noise parameters per measurement to mimic a setting 
where the same observable (e.g S) might been observed in different experiments, or 
with different assays.
=#
@unpack S, P = rn
@parameters noiseParameter1_obs_S observableParameter1_obs_P observableParameter2_obs_P
observables = Dict("obs_S" => PEtabObservable(S, noiseParameter1_obs_S * S, transformation=:log), 
                   "obs_P" => PEtabObservable(observableParameter1_obs_P*P + observableParameter2_obs_P, 1.0))

#=
The column pre_eq_id corresponds to the pre-equilibration criteria, that is under which 
simulation-condition setting to carry out a steady-state simulation prior to matching the 
model against data (e.g to mimic that cells are at steady-state at time zero). The column 
observable_parameters holds the values for the observable-parameters specified in the 
observable formula, and these can be parameters (specified in PEtabParameters) or constant 
values (e.g 1.0). In case of several values these should be separated by ;. Similiar holds 
for noise_parameter - these map to the noiseParameter... values in the noise-formula, and 
these can be parameters or values. In case there are now noise-parameters, or observable-
parameters these columns can be left out (they are optional).
=#
measurements = DataFrame(simulation_id=["c0", "c0", "c0", "c0"],
                         pre_eq_id=["c0_pre", "c0_pre", "c0_pre", "c0_pre"], # Steady-state pre-eq simulations 
                         obs_id=["obs_S", "obs_S", "obs_P", "obs_P"],
                         time=[0.0, 10.0, 1.0, 20.0],
                         measurement=[0.7, 0.1, 1.0, 1.5], 
                         observable_parameters=["", "", "scale_P;offset_P", "scale_P;offset_P"],
                         noise_parameters=["noise", "noise", "", ""])

#=
For a PEtab parameter to estimate the user must specify id. Then the user can choose scale 
to estimate the parameter on (:log10 (default), :lin and :log). For each parameter any 
univariate cont. prior can be set to turn the parameter estimation problem into a 
maximum a posteriori problem. If prior_on_linear_scale=true (default) the prior applies 
on the linear-scale, while if false it applies on the parameter scale (e.g log10 in case 
the user choose scale=:log10). In case the parameter is not be estimated the user can 
set estimate=false (default true). lb and ub (lower and upper bounds) can also be 
set for each parameter.
=#
petab_parameters = [PEtabParameter(:c3, scale=:log10, prior=Normal(0.0, 2.0), prior_on_linear_scale=false),
                    PEtabParameter(:se0, scale=:lin, prior=LogNormal(1.0, 0.5)), 
                    PEtabParameter(:scale_P, scale=:log10),
                    PEtabParameter(:offset_P, scale=:log10),
                    PEtabParameter(:noise, scale=:log10)]

# Given all this it is easy to set a PEtabODEProblem and then compute the cost, gradient 
# and Hessian.
petab_model = readPEtabModel(rn, simulation_conditions, observables, measurements,
                             petab_parameters, stateMap=state_map, parameterMap=parameter_map, 
                             verbose=true)
petab_problem = createPEtabODEProblem(petab_model)

# Compute negative log-likelihood, gradient and Hessian
θ = [1.0, 1.0, 1.0, 1.0, 1.0]
f = petab_problem.computeCost(θ)
∇f = petab_problem.computeGradient(θ)
Δf = petab_problem.computeHessian(θ)
TorkelE commented 12 months ago

Looks great! I will start re-ordering Catalyst Docs to make place for an expanded section on the inverse problem and using PEtab, and start writing the tutorials. Once PEtab.jl is updated I will start a PR and tag you so you can have a look.

TorkelE commented 12 months ago

For continuous_events, i.e. scenarios where the trigger depend on the model states like if X < 1 --> Y = 1, I think the most convenient solution would be for the user to try to encode these in the system structure.

For discrete events where we know the event time (or the event time is a model parameter), it can become very involved for the user to code the event via ModellingToolkit syntax (e.g. code something like at time t1 add substrate T and at time t2 remove substrate T). Moreover, if the event trigger happens to depend on a model parameter to estimate, e.g. t == c1, then to properly compute gradients I need to convert the time-variable to dual number, and have a function that computes tstops based on the model parameters (we currently do this for the SBML importer); all-to-all handling discrete events like dosages is messy. Adding PEtab support for dosage schemes is currently discussed https://github.com/PEtab-dev/PEtab/issues/564, and to ensure correctness I think the best solution for discrete events would be to add some additional PEtab-struct like PEtabDosage, then under the hood given a PEtabDosage write appropriate discrete callbacks. What do you think?

Interesting, I had thought that continuous callbacks would be the difficult part, and the discrete ones the easy case. When it comes to, for the user, defining discrete callbacks via ModelingToolkit, that should not be a problem. Especially for the limited number that should cover the base cases, I am more than happy to implement stuff to make this easy for the user. However, creating a PEtabDosage seems like it might be the better way to go. It will be a little weird that continuous and discrete callbacks are given differently. However, in practice, I think it will be very rare that these will actually be used. My guess is that 95%+ of cases will be:

This could easily be defined in such a PEtabDosage struct. Then, if there is a demand for more complicated discrete callbacks (or continuous callbacks) we can deal with that later.

If https://github.com/PEtab-dev/PEtab/issues/564 seems to progress slowly, we might want to go ahead and define this structure and add support for it, before main PEtab does. Then, when PEtab eventually supports this, we can align our input to match. Alternatively, us going ahead with it might push PEtab to incorporate the functionality (and use something similar to us).

The timeplan for this would depend on how messy it is for you to implement. If an implementation where dosage times do not depend on parameters is easy (and does not require lots of revamping once we want to support parameter-dependant times), then maybe we should do this first (again waiting with parameter-dependant dosage times until we see that there is a demand for them).

For the structure, maybe something like this?

# Main struct, contains one value for each file in the PEtab format.
struct PEtabModel
    ...
    doses::Vector{PEtabDose}
end

struct PEtabDose
    time::Num       # Either a formula that depends on parameters only, or a Float64
    target::Num     # The species (or parameter) that should be updated.
    value::Num      # The updated value.
end

Here, if value could either just be 1.0 (if it is a fixed value), or maybe 2p (if e.g. our dose is to double the value of the parameter p).

TorkelE commented 12 months ago

On a side note, for cases where we only represent a single species/parameter, I think that we maybe should use BasicSymbolic{Real} instead of Num. Here, Num would be reserved for equation expressions (like having an observable being X_active+X_inactive). Sorry, this was me being a bit lazy when I wrote things up.

sebapersson commented 12 months ago

I agree with using BasicSymbolic{Real} should be used for single parameters/states.

And I agree, we can implement a PEtabDose and align it later when the PEtab standard is updated. Maybe to avoid confusion by having the user write some events in the ReactionSystems, and others as dosage, we should maybe instead use a struct like this to capture both Cont. and Discrete. events?

struct PEtabDose
    trigger::Num      
    target::Num     # The species (or parameter) that should be updated.
    value::Num      # The updated value.
    duration::Num
end

Where if trigger is a single value or parameter we assume it refers to a trigger time, but we also allow trigger to be something like S > 3 and then under the hood handles it like a continues event (I have code for this). Value can then either be a concrete value or model parameter. I think it would also be nice to allow duration - that is for time-triggered events allow the user to set a duration before a value or parameter changes back to the original value. All this (except duration) should be straightforward to implement, while maintaining correctness of model gradients.

TorkelE commented 12 months ago

That PETabDose approach seems good.

sebapersson commented 12 months ago

Great, I will try to implement it then next week :) (first have to catch up a bit on the documentation)

TorkelE commented 11 months ago

I have had some problems actually fitting a model using the new interface. I have tried various options, but all fail:

using PyCall
using Optim

callibrateModel(petab_problem, Fides(verbose=false))
callibrateModel(petab_problem, parameter_map, Fides(verbose=false))
callibrateModel(petab_problem, state_map, Fides(verbose=false))
callibrateModel(petab_problem, state_map, parameter_map, Fides(verbose=false))

callibrateModel(petab_problem, IPNewton())
callibrateModel(petab_problem, parameter_map, IPNewton())
callibrateModel(petab_problem, state_map, IPNewton())
callibrateModel(petab_problem, state_map, parameter_map, IPNewton())

What is the correct syntax to use here, @sebapersson?

sebapersson commented 11 months ago

Sorry this part has been a bit of work in progress so the syntax can be a bit unclear. The correct syntax is:

using Optim

#=
p0 = Vector{Float64} which holds the start-guess for the parameter estimation. The parameter ordering can be obtained from petab_problem.θ_estNames
=#
res = calibrateModel(petabProblem, p0, Optim.IPNewton())

Currently p0 is a Vector{Float64}, but it might be worth to also allow it to be something like a parameter-map (and then under the hood the ordering is fixed)?

You can find more examples here for how to run multi-start parameter estimation. The documentation for this will soon also be up (I will soon merge the Catalyst_integration branch with main)

TorkelE commented 11 months ago

Generally, for Catalyst (and other ModeingToolkit-based tools) the recommended workflow is to use parameter maps, either

rs = @reaction_network begin
    (p,d), 0 <--> X
end
parameter_map [:p => 3.0, :d => 1.0]

or

rs = @reaction_network begin
    (p,d), 0 <--> X
end
@unpacl p, d = rs
parameter_map [p => 3.0, d => 1.0]

If useful, I can help with implementation.

sebapersson commented 11 months ago

I see. Most optimization packages take a Vector though, e.g. Optmisation.jl as here https://docs.sciml.ai/Optimization/stable/examples/rosenbrock/ which I think is the most scalable solution if there are many (>30) parameters to estimate. But for smaller problems I do like the syntax with the parameter map better. So I think the best solution is to allow both - that if a vector is provided then in calibrateModel function the parameter map is parsed into a Vector{Float64}.

Would be grateful if you could help with the implementation

TorkelE commented 11 months ago

I agree. if you make the version that acts on Vector{Float64} I will make a PR with overloads in case maps are given instead.

sebapersson commented 11 months ago

I have now merged the Catalyst integration with main, and following #69 I have also changed the naming to better follow the Julia code-standard, as can be seen in the updated docs https://sebapersson.github.io/PEtab.jl/dev/Define_in_julia/.

Next release will be breaking (as naming changes), so before that if you (@TorkelE ) have any feedback on naming etc please tell.

The only thing left now is PEtabDose which I hopefully will have time to add next week.

TorkelE commented 11 months ago

Still traveling, but will be back tomorrow (Sunday) at which point I will make some additional tutorials, should hopefully be done quite quickly.

TorkelE commented 11 months ago

I'm going through things, and only really have two thoughts:

Single simulation experiment

There are a decent number of cases when there is only really is a single experiment (e.g. modelling epidemics, where each epidemic is distinct and only happens once). Would it make sense to create a dispatch for PEtabModel when no (or empty) simulation_conditions is provided? Under the hood, PEtabModel could then create an empty condition dictionary, give it a default tag ("__condition_0") and attach it to all measurements (and do a check so that there are no parameters unaccounted for).

Observables

I'd say in the vast majority of cases, the observable is a species of the system. In these cases, it would be pleasant to be able to designate this without having to explicitly @unpack that species. E.g. for

rn = @reaction_network begin
    (p, d), 0 <--> X
end

where X is observed, something like

obs_X = PEtabObservable(:X, 0.1)

would be useful. When PEtabModel is run, one would infer that :X is in fact the variable X(t). Even for systems with more complicated observables, e.g.

rn = @reaction_network begin
    (k1,k2), X1 <--> X2
end

where we now would run

@unpack X1, X2 = rn
obs_Xtot = PEtabObservable(X1+X2, 0.1)

my goal is that observables should instead be incorporated into the reaction system:

rn = @reaction_network begin
    @observed Xtot = X1+X2    # Not actually implemented yet, the .observed field exist though, but cannot be filled this way).
    (k1,k2), X1 <--> X2
end

and one could run

obs_Xtot = PEtabObservable(:Xtot, 0.1)

However, I the current approach is more general. For a starter, it does give the flexibility of being able to define stuff like Xtot=X1+X2 within PEtabObservable, but also (as in your example) to have more explicit formula for measurement errors and noise.

I'm not sure what is best. I can imagine that it is a bit messy to support both cases. If there is an elegant solution I think it would be worth it, however, depends on what you think is best.