nlmixrdevelopment / nlmixr

nlmixr: an R package for population PKPD modeling
https://nlmixrdevelopment.github.io/nlmixr/
GNU General Public License v2.0

Feature Request: Custom Likelihood Function #355

Open billdenney opened 4 years ago

billdenney commented 4 years ago

One of the challenges for me in fully implementing nlmixr in my workflow is that I need custom likelihood functions for some projects. One straightforward example is Beal's M3 method, which I commonly use for handling concentrations below the limit of quantification. Less simple examples include the use of discrete distributions for integer data and combinations of distributions for more complex data (one recent example was a triangular distribution plus a normal distribution to handle assay-related challenges).

nlmixr currently has a good number of distributions to handle a wide variety of common scenarios, but examples like my triangular-plus-normal distribution should, in my opinion, never be handled within the core of the project, as they are too esoteric.

My thought is that a form like the following would be the way to make it happen:

cp~residCustom(fun_name="string", dv, ipred, ...)

(Argument order can be whatever makes the most sense. I think that support for named arguments is almost a must to ensure that errors don't occur between C++ and R.)

Where:

  • residCustom() is a pseudo-function that exists for use in model blocks to indicate that a custom R function will be called for calculating the likelihood.
  • fun_name is a string argument indicating the function name in R to call.
  • dv and ipred are the observed values and the predicted values.
  • ... are any other arguments needed to calculate the likelihood.

fun_name(dv=dv, ipred=ipred, ...) will return a likelihood (there could be an option for residCustom() to have it return the likelihood up to a constant). It must accept a vector of input values for each input, and it must return a numeric vector the same length as dv and ipred.

As an example, the calling syntax for Beal's M3 method could be Beal_M3(dv, ipred, lloq, sd) where dv would come from the input data set, lloq would likely (but not necessarily) come from the input data set, and sd would be calculated as the vector of current standard deviations.

I realize that this would be slower than calculating directly in C++, but the user would inherently accept such a (hopefully small) speed hit in exchange for the flexibility.

For the simulation side of things, there would probably need to be some form of sampling indicator to fun_name (e.g., fun_name(..., sample=TRUE)).
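To make the proposal concrete, a user-supplied likelihood for Beal's M3 method under this hypothetical interface could be sketched as below. Beal_M3 and the sample= flag are illustrative names from this proposal, not existing nlmixr API, and the sketch returns per-observation log-likelihoods under an assumed normal residual model:

```r
# Hypothetical user-defined likelihood for Beal's M3 method.
# Vectorized over dv/ipred/lloq/sd; returns a log-likelihood per observation.
Beal_M3 <- function(dv, ipred, lloq, sd, sample = FALSE) {
  if (sample) {
    # Simulation mode: draw from the assumed normal residual model
    return(rnorm(length(ipred), mean = ipred, sd = sd))
  }
  ifelse(
    dv < lloq,
    # BLQ record: contribution is P(observation < LLOQ)
    pnorm(lloq, mean = ipred, sd = sd, log.p = TRUE),
    # Quantified record: usual normal density
    dnorm(dv, mean = ipred, sd = sd, log = TRUE)
  )
}
```

Under the proposed syntax, this might then be referenced as cp~residCustom(fun_name="Beal_M3", dv, ipred, lloq, sd).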

Related to #182

mattfidler commented 4 years ago

The M3 method isn't documented yet, but you can use the limit and cens columns in focei to apply it. It uses the same conventions as Monolix.
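For reference, a minimal sketch of the data layout this implies (assuming the Monolix-style convention where a censored record carries the LLOQ in DV, CENS flags censoring, and LIMIT gives the other bound of the censoring interval; the values here are made up):

```r
# Illustrative data set with Monolix-style censoring columns (values invented)
d <- data.frame(
  ID    = c(1, 1, 1),
  TIME  = c(0.5, 12, 24),
  DV    = c(10.2, 3.1, 0.5),  # last record: DV holds the LLOQ, not a measurement
  CENS  = c(0, 0, 1),         # 1 = below the limit of quantification
  LIMIT = c(NA, NA, 0)        # censoring interval for the last record is (0, 0.5)
)
```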


mattfidler commented 4 years ago

This is closest to the currently implemented censoring support:

http://monolix2016.lixoft.com/data-and-models/censoreddata/

The more recent Monolix distributions changed the column names.

Also, there are gnlmm and gnlmm2, which do what you suggest. They are almost completely in R, so they are slower than their C++ counterparts. Their interface is very close to the low-level foceiFit, which means it should be relatively straightforward to convert from the nlmixr functions to this interface.

With large changes like this, where there isn't an R counterpart, I typically propose how to handle things and get feedback from the core team before I start implementing.

mattfidler commented 4 years ago

I am likely to stay close to how gnlmm2 and gnlmm currently handle things, with some translation to the UI.

billdenney commented 4 years ago

Thanks for the pointer to gnlmm and gnlmm2. It looks like they evaluate their likelihood functions in the environment of the model, with access to all parameters in the model (and, I assume, following on from that, the global environment).

As I'm looking at the gnlmm documentation, I think that the most relevant proposal would be to add a new, optional llik block to the model (so ini, llik, and model would be the three inputs, with model last so that it could find both ini and llik).
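Purely as a sketch of what that could look like (the llik block is hypothetical and does not exist in nlmixr; the Poisson count model is just for illustration):

```r
f <- function() {
  ini({
    tlambda <- 0.5
    eta.lambda ~ 0.2
  })
  llik({
    # hypothetical block: per-observation log-likelihood for count data
    dpois(dv, lambda, log = TRUE)
  })
  model({
    lambda <- exp(tlambda + eta.lambda)
  })
}
```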

mattfidler commented 4 years ago

The focei also has an error component like the llik, so I'm unsure if that is the route I'm going to take. The easiest is to keep the model as-is and add some way to specify the likelihood. The error types for gnlmm2 are defined by the UI so far, but custom likelihoods need more work.

This will probably be a longer-term request (at least until the rest of the to-do list gets completed).

jpryby commented 4 years ago

Hi Matt and Bill,

I've recently been working with the gnlmm function a lot and am running some tests comparing it with NONMEM, so it's funny this is being discussed now. My biggest problem with it, which will hopefully be addressed with this issue, is that the function output is not treated as an nlmixr object, so all the conveniences w.r.t. xpose.nlmixr, shinymixR, and various plotting functions don't apply. Because all NONMEM output generally looks the same, it's easy enough to process NONMEM reports from custom-likelihood fits, whereas with nlmixr the output for gnlmm is just the optim (or lbfgs) output, which has to be processed completely differently than nlmixr objects.

I have some other minor concerns with the current gnlmm implementation, the greatest being the paucity of documentation and the lack of the detailed examples that are present for the other types of models; I have had to essentially go through parts of the function line by line to understand how to run it. Speed has already been addressed, but NONMEM without parallelization can be slow with these types of models too, so an all-C++ implementation probably wouldn't be necessary.

Happy to help with this in any way I can.

John

mattfidler commented 4 years ago

Thanks John!

It seems like it would be a fairly straightforward task, but it would need some attention under the hood. Please understand, though, that these fits will likely not be able to be plotted or used with xpose.nlmixr.

With log-likelihood methods, it would be very difficult to determine what is meant by a goodness-of-fit plot (is it an odds ratio, a t-distribution, etc.).

However, it will produce a reduced nlmixr model; you can make nlmixr produce this in a simple way by adding calcTables=FALSE:

library(nlmixr)
one.cmt <- function() {
  ini({
    ## You may label each parameter with a comment
    tka <- 0.45 # Log Ka
    tcl <- 1 # Log Cl
    ## This works with interactive models
    ## You may also label the preceding line with label("label text")
    tv <- 3.45; label("log V")
    ## the label("Label name") works with all models
    eta.ka ~ 0.6
    eta.cl ~ 0.3
    eta.v ~ 0.1
    add.sd <- 0.7
  })
  model({
    ka <- exp(tka + eta.ka)
    cl <- exp(tcl + eta.cl)
    v <- exp(tv + eta.v)
    linCmt() ~ add(add.sd)
  })
}
fit <- nlmixr(one.cmt, theo_sd, est="focei", control=foceiControl(calcTables=FALSE, print=0))
#> ℹ parameter labels from comments will be replaced by 'label()'
#> → creating full model...
#> → pruning branches (`if`/`else`)...
#> ✔ done
#> → loading into symengine environment...
#> ✔ done
#> → creating full model...
#> → pruning branches (`if`/`else`)...
#> ✔ done
#> → loading into symengine environment...
#> ✔ done
#> → calculate jacobian
#> → calculate ∂(f)/∂(η)
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → calculate ∂(R²)/∂(η)
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in inner model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in inner model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in EBE model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in EBE model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → compiling inner model...
#> ✔ done
#> → compiling EBE model...
#> ✔ done
#> Calculating covariance matrix
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00 
#> done
#> Warning in (function (uif, data, est = NULL, control = list(), ..., sum.prod =
#> FALSE, : Initial ETAs were nudged; (Can control by foceiControl(etaNudge=.))
#> Warning in (function (uif, data, est = NULL, control = list(), ..., sum.prod
#> = FALSE, : ETAs were reset to zero during optimization; (Can control by
#> foceiControl(resetEtaP=.))
#> Warning in (function (uif, data, est = NULL, control = list(), ..., sum.prod =
#> FALSE, : Gradient problems with initial estimate and covariance; see $scaleInfo

print(fit)
#> ── nlmixr FOCEi (outer: nlminb) fit ────────────────────────────────────────────
#> 
#> 
#>           OBJF      AIC     BIC Log-likelihood Condition Number
#> FOCEi 116.8047 373.4044 393.584      -179.7022         68.60811
#> ── Time (sec $time): ───────────────────────────────────────────────────────────
#> 
#> 
#>            setup optimize covariance    other
#> elapsed 2.164155 0.132252   0.132254 0.417339
#> ── Population Parameters ($parFixed or $parFixedDf): ───────────────────────────
#> 
#> 
#>        Parameter  Est.     SE %RSE Back-transformed(95%CI) BSV(CV%) Shrink(SD)%
#> tka       Log Ka 0.464  0.195   42       1.59 (1.09, 2.33)     70.7      2.05% 
#> tcl       Log Cl  1.01 0.0751 7.42       2.75 (2.38, 3.19)     26.7      3.90% 
#> tv         log V  3.46 0.0436 1.26       31.8 (29.2, 34.6)     14.0      10.6% 
#> add.sd           0.694                               0.694                     
#>  
#>   Covariance Type ($covMethod): r,s
#>   Some strong fixed parameter correlations exist ($cor) :
#>     cor:tcl,tka  cor:tv,tka  cor:tv,tcl 
#>      0.158       0.422       0.744  
#>  
#> 
#>   No correlations in between subject variability (BSV) matrix
#>   Full BSV covariance ($omega) or correlation ($omegaR; diagonals=SDs) 
#>   Distribution stats (mean/skewness/kurtosis/p-value) available in $shrink 
#>   Minimization message ($message):  
#>     relative convergence (4)

Created on 2020-07-07 by the reprex package (v0.3.0)

Common methods like plot() will not work on this reduced fit...

mattfidler commented 4 years ago

A different feature is using logLik for individual points only, as with the M3 method.

mattfidler commented 4 years ago

This is related to #347; when nlmixr produces reduced models, many methods are not worked out.

jpryby commented 4 years ago

Thanks Matt, that makes sense. I agree that plotting functions and xpose plots would be difficult and case-specific for log-likelihood models and couldn't be implemented so easily; allowing users to define IPRED and PRED (separately from Y, as in NONMEM) for these models would be valuable for case-specific plotting, though not necessary. However, it also doesn't work as a model type for shinymixR as far as I know (@richardhooijmaijers), probably in part because it's not treated as an nlmixr object (and partly because it's a fairly obscure part of the package, as this thread shows).

I'm not sure I understand the comparison of gnlmm output to focei with calcTables=FALSE. I'm usually using gnlmm to fit count, binary, and other non-continuous data where I can't use the standard nlmixr model construction, so maybe that's part of the issue. Also, I'm far from an expert in the stats behind the scenes. However, the output from gnlmm is a lot different from the output of nlmixr(). An example from the documentation:

library(nlmixr)
llik <- function()
{
  lp = THETA[1]*x1+THETA[2]*x2+(x1+x2*THETA[3])*ETA[1]
  p = pnorm(lp)
  dbinom(x, m, p, log=TRUE)
}
inits = list(THTA=c(1,1,1), OMGA=list(ETA[1]~1))

fit <- gnlmm(llik, rats, inits, control=list(nAQD=3))
print(fit)
#> $par
#> [1] 1.3434024 0.9966831 3.2844304 1.7829125
#> 
#> $value
#> [1] 105.5922
#> 
#> $counts
#> function gradient 
#>       93       NA 
#> 
#> $convergence
#> [1] 0
#> 
#> $message
#> NULL
summary(fit)
#>             Length Class  Mode     
#> par          4     -none- numeric  
#> value        1     -none- numeric  
#> counts       2     -none- numeric  
#> convergence  1     -none- numeric  
#> message      0     -none- NULL     
#> obj          1     -none- function 
#> ETA         32     -none- numeric  
#> con         14     -none- list     
#> diag.xform   1     -none- character
#> nsplt        4     -none- numeric  
#> osplt        1     -none- numeric  
#> calls        3     -none- list
class(fit)
#> [1] "gnlmm.fit"

Created on 2020-07-07 by the reprex package (v0.3.0)

It is essentially just optim output with some extra items for use by calcCov and getOMEGA. Even without the nice styling of the nlmixr object, the named list items are much different from those in an nlmixr object.
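Until the output is wrapped, one workaround is a small helper along these lines (a hedged sketch assuming only the optim-style fields shown above; tidy_gnlmm and the parameter labels are made up for illustration):

```r
# Hypothetical helper: label a gnlmm.fit's raw optim-style output.
tidy_gnlmm <- function(fit, theta_names) {
  # label the fixed effects, then any remaining elements as omegas
  est <- fit$par
  n_omega <- length(est) - length(theta_names)
  names(est) <- c(theta_names, paste0("omega.", seq_len(n_omega)))
  list(
    estimates = est,
    objective = fit$value,            # minimized objective from optim
    converged = fit$convergence == 0  # optim convention: 0 means success
  )
}
```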

mattfidler commented 4 years ago

Yes. The nice uniform nature of the nlmixr function and output is needed here.

Note that saem is similar to gnlmm in that its output is much different from the nlmixr object; after SAEM is run, it is converted to an nlmixr object. So the current interface is

nlmixr function -> saem interface -> saem run -> nlmixr Object

The nlmixr output is based on focei, so the outputs are similar to focei.

The output that will eventually be produced would be the reduced nlmixr function above (i.e., the one with calcTables=FALSE). There may be a feature to output IPRED/PRED manually for certain models, but that is further down in scope.