PEtab-dev / PEtab

PEtab - an SBML and TSV based data format for parameter estimation problems in systems biology
https://petab.readthedocs.io
MIT License

PEtab extension for NLME #566

Open paulflang opened 11 months ago

paulflang commented 11 months ago

Which problem would you like to address? Please describe. Currently, PEtab does not allow expressing nonlinear mixed-effects (NLME) models, i.e., models containing both fixed (population) and random (patient-specific) effects. These models are very commonly used to represent inter-patient variability in pharmacological models. IIUC, such models are often represented in software like NONMEM, which was developed in the late 70s. I believe the field could benefit from a software-independent model storage and exchange format.
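
For readers less familiar with NLME, one common textbook formulation is sketched below. It is an illustration only and not part of the proposal. The symbols are assumptions introduced here: $\theta_{\mathrm{pop}}$ is a fixed (population) effect, $\eta_i$ the random effect of patient $i$, $\Omega$ its covariance, and $f$ the model output. The log-normal form of the individual parameter is just one frequent choice.

```latex
\begin{aligned}
\theta_i &= \theta_{\mathrm{pop}} \, \exp(\eta_i), & \eta_i &\sim \mathcal{N}(0, \Omega), \\
y_{ij} &= f(t_{ij}, \theta_i) + \varepsilon_{ij}, & \varepsilon_{ij} &\sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

In this notation, the current PEtab format can describe $\theta_{\mathrm{pop}}$ and $\sigma$, while $\eta_i$ and $\Omega$ are what an extension would need to cover.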

Describe the solution you would like In this spreadsheet, I have summarised a suggestion for a PEtab extension that introduces random effects in addition to the current format, which only allows expressing fixed effects (happy to jump on a brief call to discuss in person).

Describe alternatives you have considered There is a format called PharmML (Fig 2 here), which can express mixed effects, too. However, I don't think it is widely used yet, perhaps because the XML format is hard for humans to read and understand.

Additional context I have no experience with NLME. While I have briefly discussed the suggestion linked in the spreadsheet with NLME experts, it would be good to get broader insight into a variety of use cases.

eraimundez commented 10 months ago

Hi everyone,

I think this would be a great idea! I was also thinking that this could be a very nice feature to have. And I agree with @paulflang that what is mainly missing is the possibility of defining inter-patient variability (random effects) parameters.

Regarding experience with NLME: If this is upvoted, I am willing to collect ideas from my expert colleagues (and also myself 😄, although I am not an expert yet) on how to best design the extension.

However, what I see here as a potential obstacle is actually how "useful" this extension would be, i.e. whether the tool developers (e.g. NONMEM, Monolix, nlmixr, Pumas, etc) would be willing to support the NLME-PEtab format as a direct input. Maybe this is something that should also be noted when deciding whether to move forward with this idea.

What are your thoughts?

paulflang commented 10 months ago

If upvoted, support from you and your colleagues would indeed be great, @eraimundez .

Regarding usefulness, I would distinguish between intrinsic and realized usefulness. I know that pharma is a rather static field, but if we believe a PEtab extension has intrinsic usefulness, it is quite likely that at least one existing PEtab-compliant tool (pyPESTO, PEtab.jl, Pumas(QSP), etc.) will take it up. That might start a process in which the other tools don't want to be left behind and the intrinsic usefulness gets realized.

eraimundez commented 10 months ago

Hi @paulflang I am bringing up here the discussion started on this spreadsheet regarding the potential idea of having a covariates table. I think it is easier to keep track of the ideas here 😄

Indeed, the covariate mathematical expression will be part of the mathematical model. And, indeed, if there is any condition-specific covariate the place to code this would be the condition table.

When I suggested a covariates table, I had in mind that, at least in the pharmacometrics field, it is quite common to

  1. need to test several covariates to find those that are significant,
  2. have several possible mathematical ways to define the relationship,
  3. have, e.g., sex-specific transformations,
  4. adapt the reference (basal/average) value used in the transformation, e.g. 70 kg of body weight.

So, I can see this becoming a "painful" situation if one needed to directly modify the SBML or condition table every time.
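
To make the kind of covariate relationship discussed here concrete, the following is a minimal sketch of a power covariate model with a reference body weight and an optional sex-specific factor. The function name, its arguments, and the default exponent of 0.75 are assumptions chosen for illustration; none of them come from the spreadsheet or from any existing tool.

```python
def covariate_adjusted_clearance(
    cl_pop,
    weight_kg,
    sex,
    beta_weight=0.75,          # hypothetical exponent of the weight effect
    reference_weight_kg=70.0,  # reference value used in the transformation
    female_factor=1.0,         # hypothetical sex-specific multiplier
):
    """Return the typical clearance of one patient under a power model."""
    # Scale the population value by body weight relative to the reference.
    value = cl_pop * (weight_kg / reference_weight_kg) ** beta_weight
    # Apply the sex-specific transformation only where it is defined.
    if sex == "F":
        value *= female_factor
    return value


# A patient at the reference weight keeps the population value.
print(covariate_adjusted_clearance(10.0, 70.0, "M"))
```

Each of the four points in the list above corresponds to changing something in this function: which covariates appear, the functional form, the sex-specific branch, or the reference value. That is why editing the SBML model for every variant could get tedious.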

Of course, this is already trying to go one step further than having a way to store the model (which then would be fine as it is already, only including the IIV extension as mentioned in this thread). To store the model, just the final version with the finally selected model structure, noise model and covariates would need to be stored. And the covariates could be just part of the SBML and condition table as you pointed out.

But I think that, if it is decided to invest the effort, it would be optimal to go a step further and have a flexible framework from the beginning that could be parsed by existing tools.

Quickly, I could imagine something like the observables table:

covariateId covariateFormula conditionId parameterId estimate  # as an example, to be further refined

which could be parsed by already existing engines such as in Pumas or PsN (see Stepwise Covariate Model-building).
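
As a rough illustration of how a tool might read such a table, here is a minimal sketch that parses a tab-separated covariates table using only the Python standard library. The column names are the ones proposed above. Everything else is an assumption made for this example: the two rows, the formula syntax, the reading of an empty `conditionId` as "all conditions", and the interpretation of `estimate` as a 0/1 flag.

```python
import csv
import io

# Hypothetical table content; only the header row comes from the proposal.
COVARIATE_TABLE = (
    "covariateId\tcovariateFormula\tconditionId\tparameterId\testimate\n"
    "cov_weight\t(weight / 70) ^ beta_weight\t\tclearance\t1\n"
    "cov_sex\t1 + beta_sex * is_female\t\tclearance\t0\n"
)


def read_covariate_table(text):
    """Parse the TSV text into a list of dicts, one per covariate."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        # Assumption: "1" means the covariate effect is switched on.
        row["estimate"] = row["estimate"] == "1"
        rows.append(row)
    return rows


def covariates_to_test(rows):
    """Return the IDs of all covariates flagged for estimation."""
    return [row["covariateId"] for row in rows if row["estimate"]]


rows = read_covariate_table(COVARIATE_TABLE)
print(covariates_to_test(rows))
```

With a layout like this, a stepwise covariate search would only need to toggle the `estimate` column rather than edit the model file.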

FFroehlich commented 10 months ago

The PharmML probably covers most relevant usecases, so it's probably a good idea to have this as a starting point for the things that need to be covered. I agree that having something more human readable and, hence, integration with PEtab makes sense.

I like the idea of extending parameter + condition tables with noise models + covariates. For the random effect covariance structure, there are some better strategies for parametrisation (see https://doi.org/10.1016/j.celrep.2021.109507 and maybe Paul's thesis?). Regarding encoding, maybe consider how this was done in PharmML and try to come up with a human-readable variant?

eraimundez commented 10 months ago

Hi @paulflang @FFroehlich A quick follow up: I was just trying to have a look at the suggested PharmML format, however I am afraid this is no longer maintained (at all) ... Their website(s) are disabled (http://repository.ddmore.eu/ and http://ddmore.eu/pharmml ), and even redirecting to some spam site (http://www.pharmml.org/). So, I am not really sure 😅

paulflang commented 10 months ago

Interesting. I was not aware. But it should still be possible to reuse some of their ideas described in their paper, their PAGE poster or GH repo (I haven't had the opportunity to deeply look into any of that myself yet).

dweindl commented 10 months ago

I am not (yet) working on NLME and won't be able to contribute much, but I'd be happy to see it covered by PEtab.

Regarding PharmML: It seems https://github.com/pharmml/pharmml-spec contains specifications for up to version 0.4. This doesn't seem to be the latest version, since I found references to a PharmML version 0.9 elsewhere (without specification). Maybe one of the original PharmML authors can provide more recent information...

dilpath commented 9 months ago

Just a heads up, we will have a 1 hr PEtab breakout session at COMBINE 2023 in a few weeks. It's great to see this discussion here, I'm also interested in this particular extension, and I'm keen to discuss this in the breakout session. I'll try to create a hybrid event to support a virtual audience via Zoom, if possible. I'll share the Zoom details via the community mailing list, when more details about the event are known.

https://groups.google.com/u/1/g/petab-discuss

rikardn commented 9 months ago

I was one of the developers of tooling around PharmML and am currently a developer of the pharmacometrics Python package https://github.com/pharmpy/pharmpy and maintainer of PsN (mentioned earlier). Pharmpy has a JSON format for NLME models and also for dataset metadata. I would be happy to learn more about your work and discuss options, so I signed myself up on the doodle for the 9th.

eraimundez commented 9 months ago

Hi @rikardn Thank you so much for reaching out! It is great to have you on board for this initial discussion meeting as well.

Anyone else: Here is the link to the doodle https://doodle.com/meeting/participate/id/bWqxJnge It will be active until this Friday 22/09/2023.

matthiaskoenig commented 8 months ago

Is there a date/time/link for the meeting?

dweindl commented 8 months ago

> Is there a date/time/link for the meeting?

You got mail.

rikardn commented 8 months ago

Thanks all for the very interesting discussion! After getting to know the PEtab format a bit better I just wanted to share my thoughts.

In pharmacometrics, much of what is in PEtab would be considered part of the model: the experimental conditions, the expressions for the observations and noise (called the error model in pharmacometrics), and the parameters. The dataset, although not regarded as part of the model, is also very tightly linked to it. Most often a model is developed for one dataset (experiment or trial) and will never see any other dataset (apart from slight modifications, for example exclusion of outliers). This reduces the need for a file format such as PEtab for models in pharmacometrics. To me, correct me if I am wrong, it seems as if an SBML model doesn't have the same tight coupling to one single dataset and is designed to allow multiple experiments for the same model.

matthiaskoenig commented 8 months ago

Yes, we often have multiple datasets, and in the context of PBPK models we often use data from many different studies. The idea is to have a general model which can be applied to multiple datasets, instead of having a model for a single dataset (but the latter also happens quite often). As a consequence, it makes sense to split things up into model equations (SBML), optimization (PEtab), simulations (dosing protocols), and the information about the priors/nonlinear mixed effects (these could also vary depending on the questions/datasets). Perfect separation is not possible, but there are different building blocks instead of everything combined for a single dataset. You want to optimize a certain subset of data with a given model structure and priors/NLME for a subset of simulation experiments/conditions corresponding to the protocol by which the data was generated. This gives you much more flexibility and reusability, and tools can focus on certain subsets of the problem.

eraimundez commented 8 months ago

Just to add to @matthiaskoenig's reply: Even for more "simple" popPK models (simple w.r.t. PBPK models 😉) that could rely on a single dataset, this would still be beneficial.

For example:

It is a different point of view: what in pharmacometrics is considered to be a model (with everything in a single file, at least in NONMEM) is considered in PEtab to be a modular "problem", where the yaml file puts all the pieces together and allows the user to reuse any of the individual parts.
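
A minimal sketch of this modularity, assuming the layout of the PEtab version 1 yaml file as I understand it (`format_version`, `parameter_file`, and a `problems` list with `sbml_files`, `condition_files`, `observable_files` and `measurement_files`). The structure is built as a plain Python dict so the example needs only the standard library. All file names and the helper `make_problem` are hypothetical.

```python
import json


def make_problem(measurement_file, model_file="model.xml"):
    """Assemble a PEtab-style problem description from reusable parts."""
    return {
        "format_version": 1,
        "parameter_file": "parameters.tsv",
        "problems": [
            {
                # The same model and observables are shared across problems.
                "sbml_files": [model_file],
                "condition_files": ["conditions.tsv"],
                "observable_files": ["observables.tsv"],
                # Only the dataset changes from one problem to the next.
                "measurement_files": [measurement_file],
            }
        ],
    }


study_a = make_problem("measurements_study_a.tsv")
study_b = make_problem("measurements_study_b.tsv")
print(json.dumps(study_a, indent=2))
```

Here the two problems differ only in their measurement file, which is the kind of reuse that a single monolithic model file makes harder.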