Open-Systems-Pharmacology / OSPSuite-R

R package for the OSPSuite
https://www.open-systems-pharmacology.org/OSPSuite-R/

`DataCombined$addSimulationResults()`: new argument `lowestThreshold` #1164

Open PavelBal opened 1 year ago

PavelBal commented 1 year ago

Comparing simulation results with observed data can be challenging when reported observed values are 0 or are below the lower limit of quantification (BLOQ). By design, simulated values never become exactly 0, and simulated values that fall below the quantification limit of the observed data cannot be reliably compared with that data.

In both cases, simulated and observed values can be replaced by a defined value, which allows their direct comparison. E.g., zeros in the observed data can be replaced by a threshold value equal to the detection limit of the assay. In this case, all simulated values below this threshold should also be set to the threshold, and we can treat them as "not detected".

I propose to add an argument lowestThreshold = NULL to the method of adding simulation results to DataCombined (and, optionally, to adding DataSet of observed data). If defined, all values below this threshold will be replaced by the threshold.
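A minimal sketch of the replacement this proposal describes, in base R only. `applyLowestThreshold()` is a hypothetical helper used for illustration and is not an existing ospsuite function; the threshold value `0.01` is likewise an arbitrary example.

```r
# Hypothetical helper (not part of ospsuite): replace every value below
# `lowestThreshold` with the threshold itself. NULL means "no threshold".
applyLowestThreshold <- function(values, lowestThreshold = NULL) {
  if (is.null(lowestThreshold)) {
    return(values)
  }
  # pmax() compares element-wise; NA values stay NA
  pmax(values, lowestThreshold)
}

observed <- c(0, 0.5, 2.0)       # a reported zero in the observed data
simulated <- c(1e-20, 0.4, 2.2)  # simulated values never reach exactly 0

# After thresholding, the first observed and simulated values are equal
applyLowestThreshold(observed, lowestThreshold = 0.01)
applyLowestThreshold(simulated, lowestThreshold = 0.01)
```

With the default `lowestThreshold = NULL` the values are returned unchanged, which would keep the current behavior for existing callers.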

@Yuri05 @msevestre @svavil what do you think?

svavil commented 1 year ago

If we go along the route of modifying the underlying data, it's important to use the same threshold across addSimulationResults and addDataset functions.

What matters to me is that a simulation result of 1e-20 and an observed value of 1e-17 produce a residual of 0 and a fold value of 1, not 1e-17 and 1e3. We can do this either by replacing the values in the underlying data with a threshold value, or by modifying the calculateResiduals function to work with unmodified data and a threshold value (and to produce folds, while we are at it).
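To make the numbers above concrete, here is a sketch of the second route: the inputs stay unmodified and the threshold is applied only while computing residuals and folds. `residualsWithThreshold()` is a hypothetical function, not the existing `calculateResiduals()`; defining the residual as `observed - simulated` and the fold as `observed / simulated` is an assumption made for this example, as is the threshold `1e-12`.

```r
# Hypothetical helper: clamp both series to a common threshold at calculation
# time. The input vectors themselves are not modified.
residualsWithThreshold <- function(simulated, observed, threshold) {
  sim <- pmax(simulated, threshold)
  obs <- pmax(observed, threshold)
  # Assumed definitions: linear residual and simple ratio as fold value
  data.frame(residual = obs - sim, fold = obs / sim)
}

# Without an effective threshold: residual of about 1e-17, fold of about 1e3
residualsWithThreshold(simulated = 1e-20, observed = 1e-17, threshold = 0)

# With a threshold above both values: residual 0, fold 1
residualsWithThreshold(simulated = 1e-20, observed = 1e-17, threshold = 1e-12)
```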

Yuri05 commented 1 year ago

https://github.com/Open-Systems-Pharmacology/OSPSuite.Core/issues/1736

Yuri05 commented 1 year ago

Additionally: for simulated values, OSPSuite.Core already calculates a comparison threshold (based on the absolute tolerance used and the scale factor of each quantity). All simulated values below this threshold can be considered numerical zero. But I guess it's not propagated to R yet.

Yuri05 commented 1 year ago

> I propose to add an argument lowestThreshold = NULL to the method of adding simulation results to DataCombined (and, optionally, to adding DataSet of observed data). If defined, all values below this threshold will be replaced by the threshold.

The values should never be replaced in the datacombined object itself. They should only be replaced e.g. during calculation of residuals, plotting, etc.

> If we go along the route of modifying the underlying data, it's important to use the same threshold across addSimulationResults and addDataset functions.

I disagree. Each simulated output and each observed data set have in general a different threshold (LLOQ). How those thresholds are used in calculations is another question. E.g. we could provide different methods of residual calculation based on the thresholds:

  1. replace values below their corresponding LLOQ with LLOQ/2 for each output and calculate residual based on these values
  2. same as 1., but if both simulated and observed values are below their corresponding LLOQ: calculate residual as zero and fold error as one
  3. same as 1., but if both simulated and observed values are below their corresponding LLOQ: remove this {simulated; observed} values pair from the residual calculation/representation completely
  4. ...
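Options 1 to 3 can be sketched in base R as follows. Everything here is hypothetical: `residualsWithLLOQ()`, its `method` names, and the use of a linear residual (`observed - simulated`) with a simple ratio as fold error are assumptions for illustration, not existing ospsuite behavior. Each series gets its own LLOQ, and the inputs are not modified.

```r
# Hypothetical sketch of options 1-3 from the list above.
residualsWithLLOQ <- function(simulated, observed, lloqSim, lloqObs,
                              method = c("halfLLOQ", "zeroIfBothBelow", "dropIfBothBelow")) {
  method <- match.arg(method)
  simBelow <- simulated < lloqSim
  obsBelow <- observed < lloqObs
  bothBelow <- simBelow & obsBelow

  # Option 1 (basis for all methods): values below their own LLOQ become LLOQ/2
  sim <- ifelse(simBelow, lloqSim / 2, simulated)
  obs <- ifelse(obsBelow, lloqObs / 2, observed)
  result <- data.frame(residual = obs - sim, fold = obs / sim)

  if (method == "zeroIfBothBelow") {
    # Option 2: pairs where both values are below LLOQ count as a perfect match
    result$residual[bothBelow] <- 0
    result$fold[bothBelow] <- 1
  } else if (method == "dropIfBothBelow") {
    # Option 3: such pairs are removed from the result entirely
    result <- result[!bothBelow, , drop = FALSE]
  }
  result
}

simulated <- c(1e-20, 0.004, 1.8)
observed <- c(1e-17, 0.5, 2.0)

residualsWithLLOQ(simulated, observed, lloqSim = 0.01, lloqObs = 0.05, method = "halfLLOQ")
residualsWithLLOQ(simulated, observed, lloqSim = 0.01, lloqObs = 0.05, method = "zeroIfBothBelow")
residualsWithLLOQ(simulated, observed, lloqSim = 0.01, lloqObs = 0.05, method = "dropIfBothBelow")
```

In this example only the first pair has both values below their LLOQ, so it is the only row that differs between the three methods.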

See also https://docs.open-systems-pharmacology.org/shared-tools-and-example-workflows/parameter-identification#handling-of-lloq-values

svavil commented 1 year ago

@Yuri05 Thank you, I meant to agree with your option no. 2 for the residual / fold calculation.