Parameter estimation with two experimental conditions

aidinbii commented 1 year ago

Hi, I have a following setup:

A single model and two experimental conditions - healthy (H) and patient (P)
The model has 92 parameters and one of them is fixed. And other two are shared among H and P. So, in total the parameter vector has 180 parameters.
I want to estimate this parameter vector (size 180)
Also, the P case, takes the state values at the last time point of the H, as its initial condition.
In the end I will have an objective function = J_healthy + J_patient , which I want to optimize

Could you please, write if this approach image from here will work in my case? And in general I would appreciate any suggestions :)

PaulJonasJost commented 1 year ago

Hey,

some questions regarding this:

Should the 89 parameters that a condition specific be independent of each other? You would probably need to adjust your SBML-model to be able to switch conditions.
If P takes values from H, do you expect H to be in a steady state or is it longitudinal data? If it is time course data, you might want to have a look at petab_timecourse.
If you set up the petab problem correctly, pyPESTO should automatically generate an objective function of the kind you want to optimize.

In general I do not see a problem, why it should not work. pyPESTO can definitely optimize the objective function, and your question is more how to build the PEtab problem, which is definitely possible.

Hope I could help.

dilpath commented 1 year ago

Mostly re-iterating what @PaulJonasJost said.

re: PEtab Timecourse, you can use PEtab Timecourse to specify your problem as, e.g. "simulate with condition H for 100 time units, then use the endpoint as the initial condition for simulating with P for 100 time units, and fit the model parameters to this timecourse". A simple way to do this is, for each parameter k in the model, rewrite it like

k = k_H if H_flag == 1 else k_P

and estimate k_H and k_P. The your PEtab Timecourse sequence would be condition H (where H_flag = 1) for 100 time units, then condition P (where H_flag = 0) for the remaining 100 time units. PEtab Timecourse can then export to a SBML model with events that encode this directly, so there's now only one "condition".

There is an example of this "PEtab Timecourse -> Standard PEtab with SBML Events" [1][2]. Your result should end up looking like the last figures in this notebook [3].

If you want your model to use the steady-state from condition H as the initial condition of simulation with condition P, this is technically supported by PEtab Timecourse, but currently not really supported by any tools AFAIK. You could try to use condition H as a PEtab pre-equilibration condition, but then you could only fit "healthy parameters" to steady-state measurements under condition H.

If that wasn't helpful then more information about the data you're fitting would help to understand your situation better.

[1] https://github.com/dilpath/petab_timecourse/tree/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/input/simple_timecourse [2] https://github.com/dilpath/petab_timecourse/blob/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/simulate_as_events.py [3] https://github.com/dilpath/petab_timecourse/blob/main/doc/examples/simple_timecourse_simulate.ipynb

aidinbii commented 1 year ago

@PaulJonasJost , @dilpath , thank you very much for your reply

Should the 89 parameters that a condition specific be independent of each other? You would probably need to adjust your SBML-model to be able to switch conditions.

Yes, these 89 parameters should be independent of each other. I need to fit twice: one for the H condition and second for the P condition. So, I have data for both conditions

k = k_H if H_flag == 1 else k_P

I don't understand where to add these lines. I have a .yml file for the model definition. Then I convert it to PEtab tables for parameter estimation with yaml2sbml

If P takes values from H, do you expect H to be in a steady state or is it longitudinal data? If it is time course data, you might want to have a look at petab_timecourse.

I expect H and P to be in steady-state

PaulJonasJost commented 1 year ago

I don't understand where to add these lines

Those lines would be added in your sbml file I guess. If for example you have a reaction A -> B associated with parameter k_h and k_P - depending on the condition, you would define k as @dilpath described to encompass both conditions in a single parameter where now H_flag acts like a switch between those parameters. I am not entirely sure whether for your specific case yaml2sbml is the best option as I am not sure whether it has the ability to do this. Other options might be (Antimony)[https://tellurium.readthedocs.io/en/latest/antimony.html] or (Copasi)[https://copasi.org], antimony being close to the functionality you have with yaml2sbml.

I expect H and P to be in a steady-state

This seems rather confusing to me, as you previously mentioned that the P case, takes the state values at the last time point of the H, as its initial condition therefore unless H and P do not have the same steady state, there would be a time for P where it is not in steady state. My current guess would be the following, and please correct my if I am wrong:

You have measurements for both Healthy and Patient at one timepoint (not necessarily the same for both conditions)
The measurement is assumed to be taken in steady state
There is a transition from Measurement of H to Measurement of P

In this case your setup would be akin to

start in steady state of H
end in steady state of P

is that correct?

aidinbii commented 1 year ago

You have measurements for both Healthy and Patient at one timepoint (not necessarily the same for both conditions)

The measurement is assumed to be taken in steady state

There is a transition from Measurement of H to Measurement of P

Yes, that is right

In this case your setup would be akin to

start in steady state of H

end in steady state of P

Why start in steady state of H?

Okay, so, I need to rewrite my model definition from .yml to Antimony or Copasi

dilpath commented 1 year ago

You can achieve this with yaml2sbml, there's an example of a piecewise [1]. In your case, the yaml2sbml expression for k = k_H if H_flag == 1 else k_P could look like

assignments:
    - assignmentId: k
      formula: piecewise(k_H, H_flag == 1, k_P)

Alternatively, you could setup two PEtab experimental conditions. The first conditionH is simply k = k_H, to fit the healthy data. The second conditionP is k = k_P, to fit the patient data. In the measurements table, all data should have conditionH as the preequilibration condition ID, and conditionP as the simulation condition ID. The healthy data is at time 0, the patient data is at time inf. If you do this, you might be able to keep your YAML model unchanged.

[1] https://github.com/yaml2sbml-dev/yaml2sbml/blob/main/doc/examples/Format_Features/step_function.yml

aidinbii commented 1 year ago

@dilpath , aha I see But what about the objective function It will be by default: objective function = J_healthy + J_patient ?

Or I need to write my own obj.f.?

dilpath commented 1 year ago

Yes, the objective function (here, written as the negative log likelihood) will be constructed from the measurements table, so will look like J_healthy_1 + J_patient_1 + J_healthy_2 + J_patient_2 + ... + J_healthy_n + J_patient_n

i.e. all healthy and patient data will be in your objective function and used for optimization

Are you using AMICI? If so, AMICI has Newton's method and some methods tailored for sensitivities, to speed up simulation and gradient computation for steady-state data specifically.

plakrisenko commented 1 year ago

AMICI has Newton's method and some methods tailored for sensitivities, to speed up simulation and gradient computation for steady-state data specifically.

Yes, and, by default, the Newton's method is attempted first for steady-state computation. If it fails, numerical integration is performed instead. If it fails as well, Newton's method is performed again starting from the final point of the numerical integration.

For sensitivities computation, you may try setting amici_model.setSteadyStateSensitivityMode(amici.SteadyStateSensitivityMode.newtonOnly) to use tailored method for sensitivities.

Maybe also trying

amici_solver = amici_model.getSolver()
amici_solver.setSensitivityMethod(amici.SensitivityMethod.adjoint)

aidinbii commented 1 year ago

@dilpath , could you please have a look at the tables

aidinbii commented 1 year ago

Also, I guess I need to change the parameters table After running !petablint -v -y ./param_est_IgG4_HP_PEtab/problem_IgG4_HP.yml I get

Checking model...
Checking measurement table...
Checking condition table...
Checking observable table...
Checking parameter table...
Missing parameter(s) in the model or the parameters table: {'p13_P', 'p65_H', 'p85_H', 'p88_P', 'p53_H', 'p91_H', 'p26_H', 'p4_P', 'p68_P', 'p73_H', 'p35_P', 'p76_P', 'p71_P', 'p35_H', 'p9_H', 'p76_H', 'p64_P', 'p13_H', 'p69_H', 'p7_H', 'p31_H', 'p39_H', 'p39_P', 'p84_P', 'p63_P', 'p6_H', 'p34_P', 'p11_P', 'p48_P', 'p32_P', 'p79_P', 'p50_P', 'p61_P', 'p87_P', 'p25_H', 'p55_H', 'p51_P', 'p77_P', 'p78_H', 'p15_P', 'p29_H', 'p48_H', 'p91_P', 'p87_H', 'p29_P', 'p38_H', 'p54_H', 'p85_P', 'p56_H', 'p12_P', 'p66_H', 'p47_H', 'p2_H', 'p56_P', 'p24_P', 'p10_H', 'p54_P', 'p31_P', 'p5_H', 'p62_H', 'p43_H', 'p55_P', 'p80_P', 'p40_H', 'p36_P', 'p44_H', 'p8_P', 'p61_H', 'p83_H', 'p15_H', 'p37_P', 'p12_H', 'p70_P', 'p41_H', 'p6_P', 'p9_P', 'p62_P', 'p19_P', 'p8_H', 'p40_P', 'p72_P', 'p65_P', 'p22_P', 'p86_H', 'p23_H', 'p20_P', 'p64_H', 'p44_P', 'p57_P', 'p68_H', 'p43_P', 'p32_H', 'p21_H', 'p18_P', 'p90_H', 'p4_H', 'p52_H', 'p82_P', 'p67_H', 'p84_H', 'p51_H', 'p27_P', 'p21_P', 'p24_H', 'p53_P', 'p20_H', 'p17_P', 'p89_P', 'p18_H', 'p2_P', 'p60_H', 'p47_P', 'p73_P', 'p42_H', 'p41_P', 'p42_P', 'p38_P', 'p33_H', 'p10_P', 'p90_P', 'p23_P', 'p78_P', 'p46_P', 'p46_H', 'p49_H', 'p14_P', 'p67_P', 'p34_H', 'p59_P', 'p79_H', 'p16_H', 'p26_P', 'p82_H', 'p72_H', 'p45_P', 'p71_H', 'p57_H', 'p14_H', 'p33_P', 'p52_P', 'p16_P', 'p58_P', 'p86_P', 'p25_P', 'p83_P', 'p3_P', 'p45_H', 'p75_H', 'p70_H', 'p7_P', 'p30_P', 'p59_H', 'p80_H', 'p77_H', 'p74_P', 'p22_H', 'p88_H', 'p37_H', 'p19_H', 'p60_P', 'p3_H', 'p11_H', 'p50_H', 'p17_H', 'p75_P', 'p36_H', 'p49_P', 'p89_H', 'p81_H', 'p69_P', 'p66_P', 'p81_P', 'p63_H', 'p5_P', 'p27_H', 'p74_H', 'p30_H', 'p58_H'}
{'p38', 'p51', 'p30', 'p14', 'p31', 'p65', 'p42', 'p48', 'p63', 'p10', 'p46', 'p79', 'p2', 'p90', 'p85', 'p27', 'p37', 'p8', 'p26', 'p55', 'p18', 'p72', 'p75', 'p29', 'p20', 'p81', 'p4', 'p74', 'p58', 'p13', 'p82', 'p49', 'p88', 'p17', 'p68', 'p15', 'p3', 'p11', 'p40', 'p73', 'p76', 'p86', 'p69', 'p12', 'p78', 'p62', 'p66', 'p84', 'p21', 'p50', 'p16', 'p24', 'p83', 'p41', 'p33', 'p35', 'p59', 'p52', 'p91', 'p57', 'p89', 'p43', 'p45', 'p64', 'p19', 'p6', 'p22', 'p61', 'p60', 'p80', 'p39', 'p5', 'p70', 'p77', 'p34', 'p25', 'p36', 'p32', 'p44', 'p7', 'p54', 'Antigen', 'p67', 'p47', 'p53', 'p87', 'p9', 'p71', 'p23', 'p56'} are not allowed to occur in the parameters table.
Visualization table not available. Skipping.
Not OK

dilpath commented 1 year ago

The conditions and measurements table are what I meant :+1:

The error when linting occurs because the parameters *_H and *_P are now the estimated parameters, so remove e.g. p38 from your parameters table, and add p38_H and p38_P to your parameters table. p38 should now only exist as a column in your conditions table, and in your SBML model itself. p38_H and p38_P should only appear in your conditions table and parameters table. This maps the estimated p38_H and p38_P "PEtab parameter" values to the p38 "SBML parameter" in your model.

As an aside, I noticed the noise for your IL_33 measurement is very large (1e+100) . I would guess it doesn't contribute anything to your fitting.

aidinbii commented 1 year ago

I made the changes:

Removed common param-s from experimental_conditions table (scale_prot, scale_cell)
Modified the param-s table

!petablint -v -y ./param_est_IgG4_HP_PEtab/problem_IgG4_HP.yml

Checking model...
Checking measurement table...
Checking condition table...
Checking observable table...
Checking parameter table...
Visualization table not available. Skipping.
PEtab format check completed successfully.

importer = pypesto.petab.PetabImporter(petab_problem)

model_petab = importer.create_model(force_compile = False)

print("Model parameters:", list(model_petab.getParameterIds()), "\n")
print("Model const parameters:", list(model_petab.getFixedParameterIds()), "\n")
print("Model outputs:   ", list(model_petab.getObservableIds()), "\n")
print("Model states:    ", list(model_petab.getStateIds()), "\n")

Model parameters: ['p2', 'p3', 'p4', 'p5', 'p6', 'p7', 'p8', 'p9', 'p10', 'p11', 'p12', 'p13', 'p14', 'p15', 'p16', 'p17', 'p18', 'p19', 'p20', 'p21', 'p22', 'p23', 'p24', 'p25', 'p26', 'p27', 'scale_prot', 'p29', 'p30', 'p31', 'p32', 'p33', 'p34', 'p35', 'p36', 'p37', 'p38', 'p39', 'p40', 'p41', 'p42', 'p43', 'p44', 'p45', 'p46', 'p47', 'p48', 'p49', 'p50', 'p51', 'p52', 'p53', 'p54', 'p55', 'p56', 'p57', 'p58', 'p59', 'p60', 'p61', 'p62', 'p63', 'p64', 'p65', 'p66', 'p67', 'p68', 'p69', 'p70', 'p71', 'p72', 'p73', 'p74', 'p75', 'p76', 'p77', 'p78', 'p79', 'p80', 'p81', 'p82', 'p83', 'p84', 'p85', 'p86', 'p87', 'p88', 'p89', 'p90', 'p91', 'scale_cell', 'noiseParameter1_Naive_B_cells', 'noiseParameter1_plasma', 'noiseParameter1_Tregs', 'noiseParameter1_naive_CD4', 'noiseParameter1_pDC', 'noiseParameter1_cDC', 'noiseParameter1_NK', 'noiseParameter1_act_NK', 'noiseParameter1_Act_B_cells', 'noiseParameter1_nDC_flowC', 'noiseParameter1_CD4_T', 'noiseParameter1_IFN_g', 'noiseParameter1_IL_6', 'noiseParameter1_IL_10', 'noiseParameter1_IL_7', 'noiseParameter1_IL_15', 'noiseParameter1_GMCSF', 'noiseParameter1_IL_33', 'noiseParameter1_IL_12', 'noiseParameter1_IL_2', 'noiseParameter1_IL_4', 'noiseParameter1_IL_1', 'noiseParameter1_IgG4'] 

Model const parameters: ['Antigen'] 

Model outputs:    ['Naive_B_cells', 'plasma', 'Tregs', 'naive_CD4', 'pDC', 'cDC', 'NK', 'act_NK', 'Act_B_cells', 'nDC_flowC', 'CD4_T', 'IFN_g', 'IL_6', 'IL_10', 'IL_7', 'IL_15', 'GMCSF', 'IL_33', 'IL_12', 'IL_2', 'IL_4', 'IL_1', 'IgG4'] 

Model states:     ['mu_1', 'mu_2', 'mu_3', 'mu_4', 'mu_5', 'mu_6', 'mu_7', 'mu_8', 'mu_9', 'mu_10', 'mu_11', 'mu_12', 'mu_13', 'mu_14', 'mu_15', 'mu_16', 'mu_17', 'mu_18', 'mu_19', 'mu_20', 'mu_21', 'mu_22', 'mu_23', 'mu_24', 'mu_25', 'mu_26', 'mu_27', 'mu_28', 'mu_29', 'mu_30', 'mu_31', 'mu_32']

But, when creating the objective function:

obj = importer.create_objective(model = model_petab, force_compile=True)

I get an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-28-755036705e72> in <module>
----> 1 obj = importer.create_objective(model = model_petab, force_compile=True)

~/miniconda3/envs/modeling/lib/python3.9/site-packages/pypesto/petab/importer.py in create_objective(self, model, solver, edatas, force_compile, **kwargs)
    422             )
    423 
--> 424         parameter_mapping = amici.petab_objective.create_parameter_mapping(
    425             petab_problem=self.petab_problem,
    426             simulation_conditions=simulation_conditions,

~/miniconda3/envs/modeling/lib/python3.9/site-packages/amici/petab_objective.py in create_parameter_mapping(petab_problem, simulation_conditions, scaled_parameters, amici_model, **parameter_mapping_kwargs)
    512         simulation_conditions.iterrows(), prelim_parameter_mapping
    513     ):
--> 514         mapping_for_condition = create_parameter_mapping_for_condition(
    515             prelim_mapping_for_condition, condition, petab_problem, amici_model
    516         )

~/miniconda3/envs/modeling/lib/python3.9/site-packages/amici/petab_objective.py in create_parameter_mapping_for_condition(parameter_mapping_for_condition, condition, petab_problem, amici_model)
    756     logger.debug("Variable parameters simulation: " f"{condition_map_sim_var}")
    757 
--> 758     petab.merge_preeq_and_sim_pars_condition(
    759         condition_map_preeq_var,
    760         condition_map_sim_var,

~/miniconda3/envs/modeling/lib/python3.9/site-packages/petab/parameter_mapping.py in merge_preeq_and_sim_pars_condition(condition_map_preeq, condition_map_sim, condition_scale_map_preeq, condition_scale_map_sim, condition)
    739                 pass
    740             else:
--> 741                 raise ValueError(
    742                     "Cannot handle different values for dynamic "
    743                     f"parameters: for condition {condition} "

ValueError: Cannot handle different values for dynamic parameters: for condition simulationConditionId          conditionP
preequilibrationConditionId    conditionH
Name: 0, dtype: object parameter p38 is p38_H for preeq and p38_P for simulation.

dilpath commented 1 year ago

Sorry for the mistake, I spoke with @dweindl , it seems that this setup is currently not supported by AMICI, because gradients for parameter estimation will only be returned for k (the model quantity), not k_H and k_P directly, so fitting k_H and k_P via conditions will not work easily when using the PEtab+pyPESTO+AMICI setup.

The only way to do this that I now see is to define the assignments

assignments:
    - assignmentId: k
      formula: piecewise(k_P, t > 0, k_H)

Since this model would take care of the change in condition, you can remove all k, k_H and k_P from your conditions table. Keep the k_H and k_P in your parameters table. Make sure that you still have some preequilibration condition in your measurements table, so that AMICI performs preequilibration, but this can be a dummy condition, if you have nothing left in the peequilibration condition after removing the ks, k_Hs and k_Ps.

aidinbii commented 1 year ago

I don't understand why the condition in the formula is t > 0. Is it because we have preequilibration condition in the measurements table at t = 0?
And how the measurements table will change?

Also, what about the fixed parameter. The assignment will be:

assignments:
- assignmentId: Antigen
  formula: piecewise(1, t > 0, 0)

dilpath commented 1 year ago

re: t > 0, yes, but it seems this also won't work, because t > 0 will get triggered during preequilibration too it seems, sorry. Furthermore, to avoid possible numerical issues with piecewise, I would change my suggestion: use a preequilibration indicator parameter, that is 1 in the preequilibration condition, and 0 in the simulation condition.

Then define any parameters that change between conditions like this (fixed or estimated):

assignments:
    - assignmentId: k
      formula: preequilibration_indicator * k_H + (1 - preequilibration_indicator) * k_P

With this, you won't need to change your measurements table: still use the preequilibration and simulation condition there, with time 0 for healthy, and inf for patient.

aidinbii commented 1 year ago

@dilpath , I modified the model definition modelDef_IgG4_counts_healthy_patient.txt

but when !petablint -v -y ./param_est_IgG4_HP_PEtab/problem_IgG4_HP.yml I get an error:

Checking model...
libSBML Error (SBML identifier consistency): The value of the 'id' field on every instance of the following type of object in a model must be unique: <model>, <functionDefinition>, <compartmentType>, <compartment>, <speciesType>, <species>, <reaction>, <speciesReference>, <modifierSpeciesReference>, <event>, and model-wide <parameter>s. Note that <unitDefinition> and parameters defined inside a reaction are treated separately. 
Reference: L3V1 Section 3.3
   The <parameter> id 'Antigen' conflicts with the previously defined <parameter> id 'Antigen' at line 44.

libSBML Error (SBML identifier consistency): The value of the 'id' field on every instance of the following type of object in a model must be unique: <model>, <functionDefinition>, <compartmentType>, <compartment>, <speciesType>, <species>, <reaction>, <speciesReference>, <modifierSpeciesReference>, <event>, and model-wide <parameter>s. Note that <unitDefinition> and parameters defined inside a reaction are treated separately. 
Reference: L3V1 Section 3.3
   The <parameter> id 'p2' conflicts with the previously defined <parameter> id 'p2' at line 45.

libSBML Error (SBML identifier consistency): The value of the 'id' field on every instance of the following type of object in a model must be unique: <model>, <functionDefinition>, <compartmentType>, <compartment>, <speciesType>, <species>, <reaction>, <speciesReference>, <modifierSpeciesReference>, <event>, and model-wide <parameter>s. Note that <unitDefinition> and parameters defined inside a reaction are treated separately. 
Reference: L3V1 Section 3.3
   The <parameter> id 'p3' conflicts with the previously defined <parameter> id 'p3' at line 46.
...
...

WARNING: Generated invalid SBML model. Check messages above.
Checking measurement table...
Checking condition table...
Checking observable table...
Checking parameter table...
Extraneous parameter(s) in parameter table: {'p33', 'p37', 'p4', 'p8', 'p78', 'p67', 'p44', 'p6', 'p38', 'p52', 'p49', 'p12', 'p62', 'p55', 'p66', 'p3', 'p11', 'preequilibration_indicator', 'p80', 'p61', 'p81', 'p56', 'p72', 'p90', 'Antigen', 'p75', 'p18', 'p63', 'p7', 'p76', 'p89', 'p84', 'p57', 'p13', 'p48', 'p35', 'p77', 'p83', 'p64', 'p5', 'p23', 'p50', 'p45', 'p30', 'p51', 'p58', 'p24', 'p65', 'p10', 'p59', 'p68', 'p79', 'p25', 'p87', 'p36', 'p82', 'p86', 'p14', 'p19', 'p60', 'p22', 'p43', 'p31', 'p9', 'p88', 'p70', 'p34', 'p41', 'p74', 'p47', 'p71', 'p20', 'p39', 'p21', 'p2', 'p91', 'p26', 'p17', 'p16', 'p42', 'p46', 'p27', 'p54', 'p29', 'p32', 'p73', 'p69', 'p40', 'p15', 'p53', 'p85'}
{'p33', 'p37', 'p67', 'p6', 'p38', 'p52', 'p49', 'p62', 'p55', 'p3', 'p61', 'p81', 'p56', 'p90', 'Antigen', 'p63', 'p89', 'p84', 'p57', 'p13', 'p35', 'p77', 'p83', 'p64', 'p5', 'p23', 'p50', 'p30', 'p58', 'p65', 'p10', 'p25', 'p87', 'p36', 'p82', 'p14', 'p43', 'p70', 'p41', 'p74', 'p71', 'p20', 'p2', 'p16', 'p42', 'p46', 'p27', 'p73', 'p40', 'p15', 'p53', 'p4', 'p8', 'p78', 'p44', 'p12', 'p66', 'p11', 'preequilibration_indicator', 'p80', 'p72', 'p75', 'p18', 'p7', 'p76', 'p48', 'p45', 'p51', 'p24', 'p59', 'p68', 'p79', 'p86', 'p19', 'p60', 'p22', 'p31', 'p9', 'p88', 'p34', 'p47', 'p39', 'p21', 'p91', 'p26', 'p17', 'p54', 'p29', 'p32', 'p69', 'p85'} are not allowed to occur in the parameters table.
Visualization table not available. Skipping.
Not OK

dilpath commented 1 year ago

The error is that the current model definition has entities defined twice, as both an assignment and a parameter. Given your current assignments, which are of the form

assignments:
    - assignmentId: k
      formula: preequilibration_indicator * k_H + (1 - preequilibration_indicator) * k_P

you should then remove k from your parameters: definitions, and make parameters: definitions for k_Hand k_P. In the case of Antigen, simply remove it from your parameters: block.

I think preequilibration_indicator can also be removed from your parameters: block, since it is defined in your conditions.

aidinbii commented 1 year ago

@dilpath , I made the changes modelDef_IgG4_counts_HP.txt

but I get an error:

Checking model...
libSBML Error (MathML consistency): Outside of a <functionDefinition>, if a <ci> element is not the first element within a MathML <apply>, then the <ci>'s value can only be chosen from the set of identifiers of (in L2V1) <species>, <compartment>, or <parameter> objects; (in L2V2-L2V5), <species>, <compartment>, <parameter> or <reaction> objects; (in L3V1) <species>, <compartment>, <parameter>, <reaction> or <speciesReference> objects and (in L3V2) <species>, <compartment>, <parameter>, <reaction>, <speciesReference> objects or L3 package objects with defined mathematical meaning. 
Reference: L3V1 Section 3.4.3
 The formula 'preequilibration_indicator' in the math element of the <assignmentRule> uses 'preequilibration_indicator' that is not the id of a species/compartment/parameter/reaction/speciesReference.

libSBML Error (MathML consistency): Outside of a <functionDefinition>, if a <ci> element is not the first element within a MathML <apply>, then the <ci>'s value can only be chosen from the set of identifiers of (in L2V1) <species>, <compartment>, or <parameter> objects; (in L2V2-L2V5), <species>, <compartment>, <parameter> or <reaction> objects; (in L3V1) <species>, <compartment>, <parameter>, <reaction> or <speciesReference> objects and (in L3V2) <species>, <compartment>, <parameter>, <reaction>, <speciesReference> objects or L3 package objects with defined mathematical meaning. 
Reference: L3V1 Section 3.4.3
 The formula 'preequilibration_indicator' in the math element of the <assignmentRule> uses 'preequilibration_indicator' that is not the id of a species/compartment/parameter/reaction/speciesReference.

libSBML Error (MathML consistency): Outside of a <functionDefinition>, if a <ci> element is not the first element within a MathML <apply>, then the <ci>'s value can only be chosen from the set of identifiers of (in L2V1) <species>, <compartment>, or <parameter> objects; (in L2V2-L2V5), <species>, <compartment>, <parameter> or <reaction> objects; (in L3V1) <species>, <compartment>, <parameter>, <reaction> or <speciesReference> objects and (in L3V2) <species>, <compartment>, <parameter>, <reaction>, <speciesReference> objects or L3 package objects with defined mathematical meaning. 
Reference: L3V1 Section 3.4.3
 The formula 'preequilibration_indicator' in the math element of the <assignmentRule> uses 'preequilibration_indicator' that is not the id of a species/compartment/parameter/reaction/speciesReference.
...

...
WARNING: Generated invalid SBML model. Check messages above.
Checking measurement table...
Checking condition table...
Condition table contains column for unknown entity 'preequilibration_indicator'.
Checking observable table...
Checking parameter table...
Visualization table not available. Skipping.
Not OK

If I add preequilibration_indicator to the parameters: I get another error

Checking model...
Checking measurement table...
Checking condition table...
Checking observable table...
Checking parameter table...
Extraneous parameter(s) in parameter table: {'preequilibration_indicator'}
{'preequilibration_indicator'} is not allowed to occur in the parameters table.
Visualization table not available. Skipping.
Not OK

dilpath commented 1 year ago

This looks like a shortcoming of yaml2sbml, feel free to open an issue there. Until it's fixed, I would suggest manually deleting the row for preequilibration_indicator from your PEtab parameters table (but keep it in the yaml2sbml file as-is, since it still needs to exist as a parameter in the SBML).

aidinbii commented 1 year ago

I want to verify that the P condition indeed, takes the state values at the last time point (or steady state) of the H condition, as its initial condition. How can I do this?
Also, could you please explain this figure of the model fit
How can I make separate figures of the fit for H and P conditions ?

dilpath commented 1 year ago

you could simulate the model at additional time points between the initial timepoint and the steady-state "timepoint" and plot them. It looks like there is an in-built pyPESTO function for this, time_trajectory_model [1], but I haven't used it. Let me know how it goes. Just increase the timespan you plot until you see the endpoints approaching the "P" values. n_timepoints doesn't make much sense I think, since your last time point is inf, so I would suggest timepoints.
one plot per observable. the H are at t=0, the P are at t=t_inf ($t_\infty$). The vertical dotted lines split the x-axis to indicate the "infinite time" between timepoint 0 and the steady-state timepoint. Despite the legend, doesn't look like the measurements were plotted, or maybe they overlap the simulation.
as I recall, for all measurements, your preequilibration condition is conditionH, and your simulation condition is conditionP, right? Then, the easiest thing to do to plot H and P separately is to change the simulationConditionId for all H to conditionH. Use this only for plotting.

[1] https://pypesto.readthedocs.io/en/latest/api/pypesto.visualize.model_fit.html#pypesto.visualize.model_fit.time_trajectory_model

aidinbii commented 11 months ago

Thank you for your suggestions. I did it differently albeit not the most pretty way:

I created a separate model with only one condition (H or P). Then split the estimated 180 vector into best_H and best_P with size 91 each.
Then during forward simulation took the end state values of H as initial condition for P

But I have another issue (The parameters I provide are the best parameters found from the param.estimation routine)

# Call postequilibration by setting an infinity timepoint
model_H.setTimepoints(np.full(1, np.inf))
#model_H.setSteadyStateSensitivityMode(amici.SteadyStateSensitivityMode.integrateIfNewtonFails)

solver = model_H.getSolver()
solver.setNewtonMaxSteps(1000)

rdata_st_state_H = amici.runAmiciSimulation(model_H, solver)

for key, value in rdata_st_state_H.items():
    print("%12s: " % key, value)

 ts:  [inf]
           x:  [[ 1.28464451e+01  4.30879099e-10 -1.25331491e-15  1.16150632e+01
   4.80590879e-04  4.91570402e+03  4.80590879e-04  4.73050518e-11
   5.33387958e+05  2.87488276e+06  4.80590879e-04  2.86841876e-13
   4.13368357e+04 -5.43974242e-10 -4.27173717e-06 -7.85828899e-16
  -3.18141019e-14 -1.18190208e-16  1.50279715e-19 -3.20390685e-14
   1.15278242e-16  0.00000000e+00  9.38055481e-03  1.95137977e-24
   7.02182992e-21  6.94443460e-26 -3.21753046e-30 -1.61215479e-24
  -5.46361562e-25  1.88566923e-20  4.76302092e-31  7.91259193e-10]]
          x0:  [1.22675903e+03 1.22665899e+03 3.45603591e+05 1.14902588e+03
 2.17635389e+03 2.58833644e+04 2.17633737e+03 6.19214559e+05
 6.56187121e+01 8.02833792e+05 1.77989142e+03 7.76544662e+03
 1.05618695e+05 2.24220102e+02 1.29511857e+03 1.85771293e+01
 4.37219022e-02 1.85772908e+01 5.49443513e+04 1.48545359e+03
 2.07662467e+01 4.26891595e+03 1.94869201e+01 4.12986464e-03
 1.85593643e+08 8.59499908e+03 7.40365449e+01 1.34639469e+03
 1.36412399e+03 2.60460426e+02 5.80985945e+02 4.35814249e+01]
        x_ss:  [nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan]
          sx:  None
         sx0:  None
       sx_ss:  None
           y:  [[-7.18937433e-27  8.40910692e-23 -5.27067039e-19  1.84340851e+02
   5.17971584e-02  1.92149734e-12  3.09685771e-28 -1.43485173e-32
  -2.43648924e-27  5.72884835e-02  4.18324075e-05  1.77120729e-33
   1.23995182e-09 -8.08163006e-27  7.25168989e-07  1.34543367e-07
  -3.16139885e-28  1.21225675e-16  1.19323672e-23 -1.07751571e-18
  -8.02488383e-27  7.23538490e-26  1.99589576e-22]]
posteq_status:  [[1 0 0]]

As you see I get very small negative values for states and observables. But their meaning are counts, so they should be >= 0. And when I feed them to the P model I have the same situation with < 0 values. I guess it's a numerical instability?

dilpath commented 11 months ago

I guess it's just that the parameters from pyPESTO are on parameter scale. You can unscale them with e.g.

parameter_dict = dict(zip(
    pypesto_problem.x_names,
    pypesto_result.optimize_result.list[0].x,
))
unscaled_dict = petab_problem.unscale_parameters(parameter_dict)

# simulate with `unscaled_dict` values

You should then check that you get the same values from your simulation code, as you do via pyPESTO. You can extract the pyPESTO values with e.g.

result = pypesto.objective(
    pypesto_result.optimize_result.list[0].x[pypesto_problem.x_free_indices],
    sensi_orders=(0,1),
    return_dict=True,
)
simulation_df = amici.petab_objective.rdatas_to_simulation_df(
    rdatas=result['rdatas'],
    model=pypesto_problem.objective.amici_model,
    measurement_df=petab_problem.measurement_df,
)

N.B.: I assume you're doing gradient-based optimization with 1st order sensitivities (the default with the pyPESTO AMICI workflow). If not, then adjust sensi_orders accordingly. The sensitivities should not change simulation results significantly, but using the same setting for optimization and simulation removes one possible source of error.

https://petab.readthedocs.io/projects/libpetab-python/en/latest/build/_autosummary/petab.problem.html#petab.problem.Problem.unscale_parameters

https://amici.readthedocs.io/en/latest/generated/amici.petab_objective.html#amici.petab_objective.rdatas_to_simulation_df

aidinbii commented 11 months ago

Yes, during forward simulation I'm unscaling them (from log to lin)
I verified that the results - obj.f. value, simulation values of observables - are the same when using pyPESTO and my simulation routine.

rdata_st_state_H and rdata_st_state_P are calc. for steady-state from my last comment above.

sum((((df_measure_H['measurement'] - rdata_st_state_H.y[0])) / (df_measure_H['noiseParameters'] )) ** 2 / 2 + np.log(2 * 3.14 * df_measure_H['noiseParameters'] ** 2) / 2 ) 123.87209058689442

sum((((df_measure_P['measurement'] - rdata_st_state_P.y[0])) / (df_measure_P['noiseParameters'] )) ** 2 / 2 + np.log(2 * 3.14 * df_measure_P['noiseParameters'] ** 2) / 2 ) -90.91545419046182
```
`123 - 90 = 33`
```

model_HP.setSteadyStateSensitivityMode(amici.SteadyStateSensitivityMode.integrateIfNewtonFails)

solver_HP = model_HP.getSolver()
solver_HP.setNewtonMaxSteps(5000)
obj_HP = importer_HP.create_objective(model = model_HP, solver=solver_HP, force_compile=True)

obj_HP.get_fval(best_par_HP)
32.96829937280501

result_obj = obj_HP(
    result.optimize_result.list[0].x[result.problem.x_free_indices],
    sensi_orders=(0,1),
    return_dict=True,
)

result_obj['fval']
32.96829937280501

From

simulation_df = amici.petab_objective.rdatas_to_simulation_df(
rdatas=result_obj['rdatas'],
model=model_HP,
measurement_df=df_measure,
)

I get the same values (one nan because there is a zero)


# compare simulation results for obs-s from my own simulation and PyPesto
print('Healthy \n', simulation_df.loc[0:22,'simulation']  / rdata_st_state_H.y[0])
print('Patient \n', simulation_df.loc[23:,'simulation']  / rdata_st_state_P.y[0])

Healthy [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] Patient [nan, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]



So, it means that I do get small negative values for states and observables even during the parameter estimation procedure.

> N.B.: I assume you're doing gradient-based optimization with 1st order sensitivities (the default with the pyPESTO AMICI workflow). If not, then adjust sensi_orders accordingly. The sensitivities should not change simulation results significantly, but using the same setting for optimization and simulation removes one possible source of error.

Yes, that's right.
Here is the file I run for param.est. 
[param_est_IgG4_HP.txt](https://github.com/ICB-DCM/pyPESTO/files/13611281/param_est_IgG4_HP.txt)

Add: If I use finite timepoint (e.g. 1500) and not a steady-state calculation, then everything is `>=0`

aidinbii commented 11 months ago

I want to try instead of calc. stead-state, take three time points [500, 1500, 3000] and make three measurement replicates for every observable. And fit to them.

The same setup will work, right? I just need to change the measurement and conditions tables?

dilpath commented 11 months ago

Thanks, overall I agree, probably numerical noise. You might be able to avoid this by making the simulation or steady-state tolerances more strict.

There is also an option to ensure all states are nonnegative in AMICI. e.g. before optimization on line 142, add the line

problem.objective.amici_model.setAllStatesNonNegative()

You can do the same in your simulation script with e.g. model_HP.setAllStatesNonNegative()

dilpath commented 11 months ago

I want to try instead of calc. stead-state, take three time points [500, 1500, 3000] and make three measurement replicates for every observable. And fit to them.

The same setup will work, right? I just need to change the measurement and conditions tables?

Which setup are you currently using? If you use preequilibration and timepoint 0 for H, you will need to remove the preequilibration condition and instead use the simulation condition, such that non-steady-state H measurements can be specified. Apart from that, I think this setup can work.

aidinbii commented 11 months ago

Which setup are you currently using? If you use preequilibration and timepoint 0 for H, you will need to remove the preequilibration condition and instead use the simulation condition, such that non-steady-state H measurements can be specified. Apart from that, I think this setup can work.

Yes, that one Could you please have a look at the modified measurements table measurement_table_HP_nmol_nonneg_finite.csv
Also, will this be satisfied?

Also, the P case, takes the state values at the last time point of the H, as its initial condition.

dilpath commented 11 months ago

* Yes, that one
  Could you please have a look at the modified measurements table

Overall, looks like it will work for H. The observableId column looks unusual to me -- are these IDs that are defined in your PEtab observables table, or IDs from your model directly? The observable IDs should exist in the PEtab observables table, and should not exist in the SBML model.

* Also, will this be satisfied?
Also, the P case, takes the state values at the last time point of the H, as its initial condition.

It will not work currently. Do you really want the last timepoint (i.e., conditionH simulated for 3000 time units, then simulate with conditionP) or, the steady-state (i.e., conditionH simulated until steady-state, then simulate with conditionP)? The latter is easy: add preequilibrationConditionId=conditionH for the P measurements. The former is harder, here are some options, but they will take a bit of implementation effort I think:

setup AMICI to perform presimulation with conditionH for 3000 time units [1]
add an event to your model that switches the condition from conditionH to conditionP at 3000 time units
simulate with conditionH only, use the state at 3000 time units as the initial condition for the conditionP simulation, by manually working with AMICI runAmiciSimulation and edatas
use PEtab Timecourse, which is still under development but can already be used with your problem [2]

[1] https://amici.readthedocs.io/en/latest/ExampleExperimentalConditions.html [2] https://github.com/dilpath/petab_timecourse

aidinbii commented 11 months ago

Overall, looks like it will work for H. The observableId column looks unusual to me -- are these IDs that are defined in your PEtab observables table, or IDs from your model directly? The observable IDs should exist in the PEtab observables table, and should not exist in the SBML model.

Both. The model definition that I use is here https://github.com/ICB-DCM/pyPESTO/issues/1138#issuecomment-1818207761

yaml_file_petab = './model_definition/modelDef_IgG4_counts_HP.yml'
PEtab_dir = './param_est_IgG4_HP_PEtab'
PEtab_yaml_name = 'problem_IgG4_HP.yml'
measurement_table_name = 'measurement_table_HP_nmol.tsv'
model_name = 'modelDef_IgG4_counts_HP'

yaml2sbml.yaml2petab(yaml_file_petab,
                     PEtab_dir, 
                     model_name, 
                     PEtab_yaml_name, 
                     measurement_table_name)

Why can't I have them defined in the SBML model?

Do you really want the last timepoint (i.e., conditionH simulated for 3000 time units, then simulate with conditionP) or, the steady-state (i.e., conditionH simulated until steady-state, then simulate with conditionP)?

As I wrote before there are issues with <0 state values. But, when I simulate for some finite time(e.g. 3000), I don't encounter such issues. So, I want to try to fit without steady-state.

The latter is easy: add preequilibrationConditionId=conditionH for the P measurements.

You mean the way I did before? https://github.com/ICB-DCM/pyPESTO/issues/1138#issuecomment-1812344623

dilpath commented 11 months ago

Why can't I have them defined in the SBML model?

Hm, you're right, it's not explicitly excluded [1], but it looks like bad practice to me. In principle, they represent different things, and might cause problems in some tools due to name collisions. If it's worked so far, then nevermind.

As I wrote before there are issues with <0 state values. But, when I simulate for some finite time(e.g. 3000), I don't encounter such issues. So, I want to try to fit without steady-state.

Ah, did you already try adjusting tolerances (make them more strict) or the setAllStatesNonNegative feature? These would be the easiest things to try first I think.

* > The latter is easy: add preequilibrationConditionId=conditionH for the P measurements.

You mean the way I did before?

Nevermind, this would only work if you want to simulate to steady-state. But yes, as you did before.

If you want to stick to simulating to 3000 time units, I think the easiest option currently might be to use PEtab Timecourse to integrate the change in condition from conditionH to conditionP at 3000 time units, as an event in the model. Then your timepoints would be in [0, 3000] for H measurements, and [3000, 6000] for the P measurements, both simulated with some dummy condition ID. This should not be too much effort to setup.

There is a script [2] to convert a simple PEtab problem with a timecourse [3] into a PEtab problem with SBML events. Your `timecourse.tsv` will look like	timecourseId	timecourse
timecourse1	0:conditionH;3000:conditionP

I'm happy to provide more help with this.

[1] https://petab.readthedocs.io/en/latest/documentation_data_format.html#id2 [2] https://github.com/dilpath/petab_timecourse/blob/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/simulate_as_events.py [3] https://github.com/dilpath/petab_timecourse/blob/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/input/simple_timecourse/timecourse.tsv

aidinbii commented 11 months ago

Ah, did you already try adjusting tolerances (make them more strict) or the setAllStatesNonNegative feature? These would be the easiest things to try first I think.

Yep, I'm trying this at the moment. In parallel want to try without steady-state

aidinbii commented 11 months ago

By making the tolerances more strict

solver.setRelativeTolerance(1e-14)
solver.setAbsoluteTolerance(1e-20)

solver.setRelativeToleranceSteadyState(1e-14)
solver.setAbsoluteToleranceSteadyState(1e-20)

solver.setRelativeToleranceSteadyStateSensi(1e-12)
solver.setAbsoluteToleranceSteadyStateSensi(1e-16)

Most of the states become zero

 ts:  [inf]
           x:  [[3.34407887e+02 3.34395690e+02 0.00000000e+00 3.33579337e+02
  6.21293607e-06 3.19180297e+05 6.21293607e-06 1.23482664e-02
  1.99183518e+06 6.69813011e+01 6.21293607e-06 6.42189103e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 1.30188206e-01 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
          x0:  [1.22675903e+03 1.22665899e+03 3.45603591e+05 1.14902588e+03
 2.17635389e+03 2.58833644e+04 2.17633737e+03 6.19214559e+05
 6.56187121e+01 8.02833792e+05 1.77989142e+03 7.76544662e+03
 1.05618695e+05 2.24220102e+02 1.29511857e+03 1.85771293e+01
 4.37219022e-02 1.85772908e+01 5.49443513e+04 1.48545359e+03
 2.07662467e+01 4.26891595e+03 1.94869201e+01 4.12986464e-03
 1.85593643e+08 8.59499908e+03 7.40365449e+01 1.34639469e+03
 1.36412399e+03 2.60460426e+02 5.80985945e+02 4.35814249e+01]

Same happens if I set model_HP.setAllStatesNonNegative()

dilpath commented 11 months ago

Are you reporting this because the zeroes are an issue? I'm unfamiliar with your model, so I can't comment on whether its output is reasonable. But, this looks good to me, since it has resolved the negative values.

If you get very similar results from either stricter tolerances or setAllStatesNonNegative, then I think setAllStatesNonNegative might be the better option, since the stricter tolerances slow down simulation.

aidinbii commented 11 months ago

Regarding the timecourse setup. I run param.est. with this script. But I think I did it wrong. Could you please have a look. I've changed only the measurements_table and added timecourse.tsv

dilpath commented 11 months ago

I don't see an immediate issue. You can visualize the timecourse trajectory of the optimized fit with pyPESTO's time_trajectory_model method [1]. Add enough timepoints so that the plot looks dense/smooth near t=3000 -- does it look like the condition changes at t=3000 in the plot?

[1] https://pypesto.readthedocs.io/en/latest/api/pypesto.visualize.model_fit.html#pypesto.visualize.model_fit.time_trajectory_model

aidinbii commented 11 months ago

It seems to be working Screenshot_20231213_175401 Screenshot_20231213_175452

I want to have three measurement replicates

take three time points for H [500, 1500, 3000] and make three measurement replicates for every observable. And fit to them. For P [3500, 4500, 6000]

For this I changed the measurement table like this measurement_table_HP_nmol_nonneg_finite.csv

timecourse.tsv will be the same?

dilpath commented 11 months ago

For this I changed the measurement table like this measurement_table_HP_nmol_nonneg_finite.csv

timecourse.tsv will be the same?

Ah, add an empty row to your conditions table for the conditionId timecourse1, and change all measurements to have this as their simulationConditionId.

e.g. in this example there is an empty row for timecourse1 [1], and all measurements have this dummy condition [2].

[1] https://github.com/dilpath/petab_timecourse/blob/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/input/simple_timecourse/conditions.tsv#L3 [2] https://github.com/dilpath/petab_timecourse/blob/94c61b4bb96bc5e8b707dcf1c6f95a7f343d3384/doc/examples/amici_periods_vs_events/input/simple_timecourse/measurements.tsv

aidinbii commented 11 months ago

It didn't work. To test it I run for n_start=500 the param.est. It fails to converge after about t=3000. Screenshot_20231214_113308 Screenshot_20231214_113336

Here are the tables:

aidinbii commented 11 months ago

Oh, sorry if increase to n_timepoints=10000 and solver.setMaxSteps(50000) it converges. Screenshot_20231214_114418

It seems to be working but still how can I verify that it actually does what it supposed to do Also, during param.est. only 93 starts converged to a finite value

number of starts: 500 
* best value: 658.6733405684504, id=402
* worst value: inf, id=99
* number of non-finite values: 407

dilpath commented 11 months ago

It seems to be working but still how can I verify that it actually does what it supposed to do

What are you trying to verify? That the solution has reached a steady-state near t=3000, before switching to the next condition? You could use the time_trajectory_model like you did before, or simulate with output near t=3000 and check that some norm on the ODE right-hand-side is small. In general though, you cannot guarantee that the system has reached steady-state by t=3000, and instead should use the previous setup (preequilibration), with e.g. setAllStatesNonNegative.

Also, during param.est. only 93 starts converged to a finite value

This is problem-specific, I guess there are parameter combinations in your bounds that cannot be simulated successfully. I have seen worse than 93/500 "successful" fits, but you can still can get a good result.

ICB-DCM / pyPESTO

Parameter estimation with two experimental conditions #1138