nlmixr2 / nlmixr2est

nlmixr2 estimation routines
https://nlmixr2.github.io/nlmixr2est/
GNU General Public License v2.0
7 stars 3 forks source link

Suspected bug in nlmixr2, combined IV and depot administration with EVID=4 reset+dose events #455

Open PyryValitalo opened 1 month ago

PyryValitalo commented 1 month ago

Hi,

Using latest version of nlmixr2, I simulate data and then try to estimate the posthoc random effect values for the simulated dataset. I was using the same parameter values for simulation and posthoc estimation. However the posthoc predictions are essentially flat lines, which suggests a bug occurs somewhere. I also tried fitting the model to the simulated data via SAEM and focei algorithms, and the two algorithms seem to fail differently. "posthocs" and "focei" end up saying that PRED is constant for the whole dataset, whereas SAEM at least manages to give different values for PRED.

I am aware that there should be no random effects on F when using the fo family of algorithms, reference: https://github.com/nlmixrdevelopment/nlmixr/issues/344#issuecomment-653257427

I simulate with variability on F, and estimate with SAEM while assuming variability on F. For "posthocs" and "focei" estimation, no random effect on F is specified.

It is noteworthy that the rxode2 simulation seems to work as desired (produces simulated time-conc plots that look like what is expected), but the estimation steps do not seem to work as desired. I suspect the trouble could be in rxode2 - nlmixr2 integration, and not in rxode2 itself.

A reproducible and fairly minimal example follows. I realize that the parameter values might not be perfectly physiological, that is because I ended up with the example by reducing my original model, instead of crafting the example from scratch.

## libraries
library(tidyverse)
library(nlmixr2)
library(rxode2)
## test model
basemodel1 <- function() {
  ini({
    tcl1 <- log(2.1); label("Cl1")
    wtcl1 <- (.75); label("wtCL1")
    tvc <- log(25) ; label("Vc")
    wtv <- (1); label("wtV")
    tka <- log(.8) ; label("ka")
    flogit <- log(4)
    ## random effects
    eta.cl1 ~ 0.3
    eta.vc ~ 0.3
    eta.ka ~ 0.3
    eta.f ~0.3
    add.sd <- c(0,0.2,5)
  })
  # and a model block with the error specification and model specification
  model({
    cl <- exp(tcl1 + eta.cl1+logwt*wtcl1)
    vc <- exp(tvc + eta.vc+logwt*wtv)
    ka <- exp(tka + eta.ka)
    ftemp <- exp(flogit+eta.f)
    f <- ftemp/(1+ftemp)
    f(depot) <- f
    cp <- linCmt()
    lcp <- log(cp)
    lcp ~ add(add.sd)
  })
}
## check correct mu referencing,make another model with no BSV on f,
## because "For the fo family of methods, the bioavailibility, rate, duration and lag times will work as long as there is only one for the whole population"
## reference: https://github.com/nlmixrdevelopment/nlmixr/issues/344#issuecomment-653257427
nlmixr(basemodel1)
basemodel2 <- basemodel1 %>% 
  model(ftemp=exp(flogit)) 
## dataset of log-uniform distribution of nID WT values between 10kg and 70kg, 
## all of whom receive both IV and IM administrations
## log-wt is pre-coded as a covariate, and is constant within an individual
nID <- 50
wtseq <- exp(seq(log(10),log(70),length.out=nID))
PKtimes <- c(0,.25,.5,1,2,5,12)
## simulate data from model
dftemplate <- do.call(rbind,map(seq_len(nID),~tibble(id=.,WT=wtseq[.],logwt=log(WT/70),time=PKtimes,AMT=ifelse(time==0,100*WT,NA),
                                                     CMT=1+(is.na(AMT)),DV=NA,evid=ifelse(time==0,4,0),TRT=1) %>% 
                                  rbind(.,mutate(.,CMT=2,TRT=2))))
dfres <- rxSolve(basemodel1, dftemplate) 
df <- dftemplate
df$DV[is.na(df$AMT)] <- dfres$sim
## observe the simulated data
ggplot(df,aes(time,DV,group=id))+geom_line()+facet_wrap(~TRT)
### fit models to simulated data
## posthoc
m0 <- nlmixr2(basemodel2,df,"posthoc")
m0 <- mutate(m0,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
m0 %>% gather(var,val,IPRED,DV) %>% 
  mutate(TRT=c("Depot","IV")[TRT]) %>% 
  ggplot(aes(TIME,val,group=interaction(ID,var),col=var))+geom_line()+facet_wrap(~TRT)
### complementary fittings
## saem
m1 <- nlmixr2(basemodel1,df,"saem",control=saemControl(covMethod = "s"))
m1 <- mutate(m1,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m1,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)
## focei
m2 <- nlmixr2(basemodel2,df,"focei",control=foceiControl(covMethod = "s"))
m2 <- mutate(m2,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m2,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)

A relevant plot of posthoc predictions (IPRED) and simulated data (DV, log-concentrations) versus TIME:

image

> sessionInfo()
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=English_Finland.utf8  LC_CTYPE=English_Finland.utf8    LC_MONETARY=English_Finland.utf8 LC_NUMERIC=C                     LC_TIME=English_Finland.utf8    

time zone: Europe/Helsinki
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] rxode2_2.1.3      nlmixr2_2.1.2     nlmixr2data_2.0.9 lubridate_1.9.3   forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2       readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      ggplot2_3.5.0    
[13] tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] gtable_0.3.4        n1qn1_6.0.1-11      xfun_0.43           RApiSerialize_0.1.3 lattice_0.21-9      tzdb_0.4.0          vctrs_0.6.5         tools_4.3.2         generics_0.1.3      symengine_0.2.6     fansi_1.0.6        
[12] pkgconfig_2.0.3     Matrix_1.6-1.1      data.table_1.15.2   checkmate_2.3.1     RcppParallel_5.1.7  lifecycle_1.0.4     farver_2.1.1        compiler_4.3.2      munsell_0.5.0       qs_0.26.3           sys_3.4.2          
[23] lazyeval_0.2.2      pillar_1.9.0        crayon_1.5.2        cachem_1.0.8        nlme_3.1-163        tidyselect_1.2.1    digest_0.6.35       lotri_0.4.3         stringi_1.8.3       labeling_0.4.3      nlmixr2est_2.2.2   
[34] rxode2ll_2.0.11     fastmap_1.1.1       grid_4.3.2          colorspace_2.1-0    rxode2parse_2.0.19  cli_3.6.2           dparser_1.3.1-11    magrittr_2.0.3      utf8_1.2.4          withr_3.0.0         scales_1.3.0       
[45] backports_1.5.0     timechange_0.3.0    hms_1.1.3           stringfish_0.16.0   nlmixr2plot_2.0.9   memoise_2.0.1       knitr_1.45          rex_1.2.1           rxode2et_2.0.13     rxode2random_2.1.1  PreciseSums_0.6    
[56] rlang_1.1.3         Rcpp_1.0.12         glue_1.7.0          lbfgsb3c_2024-3.4   rstudioapi_0.16.0   R6_2.5.1       

As I mentioned, the "posthocs" and "focei" algorithms end up claiming that PRED is constant for all individuals in the dataset, whereas SAEM does not. Of note, I have coded the inits so that f ends up being 0.8, and the "posthocs" and "focei" algorithms seem to conclude that f is the prediction (since PRED=0.8 for them), when they should only interpret f as a scaling factor for the actual concentration predictions.

> cbind(posthocMethod=m0$PRED,foceiMethod=m2$PRED,saemMethod=m1$PRED) %>% summary()
 posthocMethod  foceiMethod       saemMethod       
 Min.   :0.8   Min.   :0.8001   Min.   :-551.1857  
 1st Qu.:0.8   1st Qu.:0.8001   1st Qu.:  -0.9453  
 Median :0.8   Median :0.8001   Median :   4.4810  
 Mean   :0.8   Mean   :0.8001   Mean   : -12.7640  
 3rd Qu.:0.8   3rd Qu.:0.8001   3rd Qu.:   5.0241  
 Max.   :0.8   Max.   :0.8001   Max.   :   5.8948 
mattfidler commented 1 month ago

Hi @PyryValitalo

Thank you for your patience and giving me reproducible example to work from.

I will try to reproduce this shortly.

Before you begin, I wanted to make you aware that we can have random effects on f() and any of the other dose-based adjustments in the current version of nlmixr2

mattfidler commented 1 month ago

Indeed, I can reproduce this:

## libraries
library(tidyverse)
library(nlmixr2)
#> Loading required package: nlmixr2data
library(rxode2)
#> rxode2 2.1.3.9000 using 8 threads (see ?getRxThreads)
#>   no cache: create with `rxCreateCache()`
## test model
basemodel1 <- function() {
  ini({
    tcl1 <- log(2.1); label("Cl1")
    wtcl1 <- (.75); label("wtCL1")
    tvc <- log(25) ; label("Vc")
    wtv <- (1); label("wtV")
    tka <- log(.8) ; label("ka")
    flogit <- log(4)
    ## random effects
    eta.cl1 ~ 0.3
    eta.vc ~ 0.3
    eta.ka ~ 0.3
    eta.f ~0.3
    add.sd <- c(0,0.2,5)
  })
  # and a model block with the error specification and model specification
  model({
    cl <- exp(tcl1 + eta.cl1+logwt*wtcl1)
    vc <- exp(tvc + eta.vc+logwt*wtv)
    ka <- exp(tka + eta.ka)
    ftemp <- exp(flogit+eta.f)
    f <- ftemp/(1+ftemp)
    f(depot) <- f
    cp <- linCmt()
    lcp <- log(cp)
    lcp ~ add(add.sd)
  })
}
## check correct mu referencing,make another model with no BSV on f,
## because "For the fo family of methods, the bioavailibility, rate, duration and lag times will work as long as there is only one for the whole population"
## reference: https://github.com/nlmixrdevelopment/nlmixr/issues/344#issuecomment-653257427
nlmixr(basemodel1)
#>  ── rxode2-based solved PK 1-compartment model with first-order absorption ────── 
#>  ── Initalization: ──  
#> Fixed Effects ($theta): 
#>       tcl1      wtcl1        tvc        wtv        tka     flogit     add.sd 
#>  0.7419373  0.7500000  3.2188758  1.0000000 -0.2231436  1.3862944  0.2000000 
#> 
#> Omega ($omega): 
#>         eta.cl1 eta.vc eta.ka eta.f
#> eta.cl1     0.3    0.0    0.0   0.0
#> eta.vc      0.0    0.3    0.0   0.0
#> eta.ka      0.0    0.0    0.3   0.0
#> eta.f       0.0    0.0    0.0   0.3
#>  ── μ-referencing ($muRefTable): ──  
#>    theta     eta level  covariates
#> 1   tcl1 eta.cl1    id logwt*wtcl1
#> 2    tvc  eta.vc    id   logwt*wtv
#> 3    tka  eta.ka    id            
#> 4 flogit   eta.f    id            
#> 
#>  ── Model (Normalized Syntax): ── 
#> function() {
#>     ini({
#>         tcl1 <- 0.741937344729377
#>         label("Cl1")
#>         wtcl1 <- 0.75
#>         label("wtCL1")
#>         tvc <- 3.2188758248682
#>         label("Vc")
#>         wtv <- 1
#>         label("wtV")
#>         tka <- -0.22314355131421
#>         label("ka")
#>         flogit <- 1.38629436111989
#>         add.sd <- c(0, 0.2, 5)
#>         eta.cl1 ~ 0.3
#>         eta.vc ~ 0.3
#>         eta.ka ~ 0.3
#>         eta.f ~ 0.3
#>     })
#>     model({
#>         cl <- exp(tcl1 + eta.cl1 + logwt * wtcl1)
#>         vc <- exp(tvc + eta.vc + logwt * wtv)
#>         ka <- exp(tka + eta.ka)
#>         ftemp <- exp(flogit + eta.f)
#>         f <- ftemp/(1 + ftemp)
#>         f(depot) <- f
#>         cp <- linCmt()
#>         lcp <- log(cp)
#>         lcp ~ add(add.sd)
#>     })
#> }
basemodel2 <- basemodel1 %>%
  model(ftemp=exp(flogit))
#> ! remove between subject variability `eta.f`
## dataset of log-uniform distribution of nID WT values between 10kg and 70kg,
## all of whom receive both IV and IM administrations
## log-wt is pre-coded as a covariate, and is constant within an individual
nID <- 50
wtseq <- exp(seq(log(10),log(70),length.out=nID))
PKtimes <- c(0,.25,.5,1,2,5,12)
## simulate data from model
dftemplate <- do.call(rbind,map(seq_len(nID),~tibble(id=.,WT=wtseq[.],logwt=log(WT/70),time=PKtimes,AMT=ifelse(time==0,100*WT,NA),
                                                     CMT=1+(is.na(AMT)),DV=NA,evid=ifelse(time==0,4,0),TRT=1) %>%
                                               rbind(.,mutate(.,CMT=2,TRT=2))))
dfres <- rxSolve(basemodel1, dftemplate)
df <- dftemplate
df$DV[is.na(df$AMT)] <- dfres$sim
## observe the simulated data
ggplot(df,aes(time,DV,group=id))+geom_line()+facet_wrap(~TRT)
#> Warning: Removed 50 rows containing missing values or values outside the scale range
#> (`geom_line()`).

### fit models to simulated data
## posthoc
m0 <- nlmixr2(basemodel2,df,"posthoc")
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43064
m0 <- mutate(m0,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
m0 %>% gather(var,val,IPRED,DV) %>%
  mutate(TRT=c("Depot","IV")[TRT]) %>%
  ggplot(aes(TIME,val,group=interaction(ID,var),col=var))+geom_line()+facet_wrap(~TRT)

### complementary fittings
## saem
m1 <- nlmixr2(basemodel1,df,"saem",control=saemControl(covMethod = "s", print=0))
#> → loading into symengine environment...
#> → pruning branches (`if`/`else`) of saem model...
#> ✔ done
#> → finding duplicate expressions in saem model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in saem model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> ✔ done
#> ℹ calculate uninformed etas
#> ℹ done
#> → loading into symengine environment...
#> → pruning branches (`if`/`else`) of saem model...
#> ✔ done
#> → finding duplicate expressions in saem predOnly model 0...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in saem predOnly model 1...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in saem predOnly model 1...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in saem predOnly model 2...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in saem predOnly model 2...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> ✔ done
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43064
#> → compress phiM in nlmixr2 object, save 315640
#> → compress parHistData in nlmixr2 object, save 16872
m1 <- mutate(m1,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m1,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)

## focei
m2 <- nlmixr2(basemodel2,df,"focei",control=foceiControl(covMethod = "s", print=0))
#> calculating covariance matrix
#> [====|====|====|====|====|====|====|====|====|====] 0:00:02 
#> done
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43064
#> → compress parHistData in nlmixr2 object, save 6360
m2 <- mutate(m2,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m2,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)

Created on 2024-07-18 with reprex v2.1.1

mattfidler commented 1 month ago

A couple of other things that I will mention before I post a possible work-around in your situation:

ini({
...
    flogit <- logit(0.8)
...
})
model({
...
f(depot) <- expit(flogit + eta.f)
...
})

In the fit you would also see the back-transformed values:

── Population Parameters (m0$parFixed or m0$parFixedDf): ──

         Parameter   Est. Back-transformed BSV(CV% or SD) Shrink(SD)%
tcl1           Cl1  0.742              2.1           59.1      17.5% 
wtcl1        wtCL1   0.75             0.75                           
tvc             Vc   3.22               25           59.1   -0.0587% 
wtv            wtV      1                1                           
tka             ka -0.223              0.8           59.1      11.2% 
flogit               1.39              0.8          0.548      100.% 
lnorm.sd              0.2              0.2                           

This allows objective functions that are comparable between additive error, proportional error etc. It will also produce residuals simulations etc on the normal scale.

With that being said, the linCmt() solutions sometimes are not working correctly. If you switch to ODEs these are working correctly.

I am actively working toward a new linear compartment solution based on the R package wnl. The advan solutions have had some stability in certain edge cases. I don't believe the evids also behave exactly the same with linear compartmental solutions and odes, so this may be the issue as well.

Modifyiing using logit/expit and lnorm() residuals, I have the following:

## libraries
library(tidyverse)
library(nlmixr2)
#> Loading required package: nlmixr2data
library(rxode2)
#> rxode2 2.1.3.9000 using 8 threads (see ?getRxThreads)
#>   no cache: create with `rxCreateCache()`

set.seed(1024)
rxSetSeed(1024)

## test model
basemodel1 <- function() {
  ini({
    tcl1 <- log(2.1); label("Cl1")
    wtcl1 <- (.75); label("wtCL1")
    tvc <- log(25) ; label("Vc")
    wtv <- (1); label("wtV")
    tka <- log(.8) ; label("ka")
    flogit <- logit(0.8)
    ## random effects
    eta.cl1 ~ 0.3
    eta.vc ~ 0.3
    eta.ka ~ 0.3
    eta.f ~0.3
    lnorm.sd <- c(0,0.2,5)
  })
  # and a model block with the error specification and model specification
  model({
    cl <- exp(tcl1 + eta.cl1+logwt*wtcl1)
    vc <- exp(tvc + eta.vc+logwt*wtv)
    ka <- exp(tka + eta.ka)
    kel <- cl / vc
    d/dt(depot) <- -ka * depot
    d/dt(central) <- ka * depot - kel * central
    f(depot) <- expit(flogit + eta.f)
    cp <- central / vc
    cp ~ lnorm(lnorm.sd)
  })
}

## dataset of log-uniform distribution of nID WT values between 10kg and 70kg,
## all of whom receive both IV and IM administrations
## log-wt is pre-coded as a covariate, and is constant within an individual
nID <- 50
wtseq <- exp(seq(log(10),log(70),length.out=nID))
PKtimes <- c(0,.25,.5,1,2,5,12)
## simulate data from model
dftemplate <- do.call(rbind,map(seq_len(nID),~tibble(id=.,WT=wtseq[.],logwt=log(WT/70),time=PKtimes,AMT=ifelse(time==0,100*WT,NA),
                                                     CMT=1+(is.na(AMT)),DV=NA,evid=ifelse(time==0,4,0),TRT=1) %>%
                                               rbind(.,mutate(.,CMT=2,TRT=2))))
dfres <- rxSolve(basemodel1, dftemplate)

df <- dftemplate
df$DV[is.na(df$AMT)] <- dfres$sim

## observe the simulated data
ggplot(df,aes(time,DV,group=id))+geom_line()+facet_wrap(~TRT)
#> Warning: Removed 50 rows containing missing values or values outside the scale range
#> (`geom_line()`).


### fit models to simulated data
## posthoc
m0 <- nlmixr2(basemodel1,df,"posthoc")
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43024

plot(augPred(m0))


m0 <- mutate(m0,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
m0 %>% gather(var,val,IPRED,DV) %>%
  mutate(TRT=c("Depot","IV")[TRT]) %>%
  ggplot(aes(TIME,val,group=interaction(ID,var),col=var))+geom_line()+facet_wrap(~TRT)


### complementary fittings
## saem
m1 <- nlmixr2(basemodel1,df,"saem",control=saemControl(covMethod = "s", print=0))
#> → loading into symengine environment...
#> → pruning branches (`if`/`else`) of saem model...
#> ✔ done
#> → finding duplicate expressions in saem model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in saem model...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> ✔ done
#> ℹ calculate uninformed etas
#> ℹ done
#> → loading into symengine environment...
#> → pruning branches (`if`/`else`) of saem model...
#> ✔ done
#> → finding duplicate expressions in saem predOnly model 0...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in saem predOnly model 1...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → optimizing duplicate expressions in saem predOnly model 1...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> → finding duplicate expressions in saem predOnly model 2...
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00
#> ✔ done
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43024
#> → compress phiM in nlmixr2 object, save 315168
#> → compress parHistData in nlmixr2 object, save 17080
m1 <- mutate(m1,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])

plot(augPred(m1))


ggplot(m1,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)


## focei
m2 <- nlmixr2(basemodel1,df,"focei",control=foceiControl(covMethod = "s", print=0,
                                                         outerOpt="bobyqa"))
#> unhandled error message: EE:lsoda -- at t = 18.1988 and step size _rxC(h) = 2.40561e-12, the
#>  corrector convergence failed repeatedly or
#>  with fabs(_rxC(h)) = hmin
#>  @(lsoda.c:893
#> calculating covariance matrix
#> [====|====|====|====|====|====|====|====|====|====] 0:00:00 
#> done
#> → Calculating residuals/tables
#> ✔ done
#> → compress origData in nlmixr2 object, save 43024
#> → compress parHistData in nlmixr2 object, save 14416

plot(augPred(m2))


m2 <- mutate(m2,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m2,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)

Created on 2024-07-18 with reprex v2.1.1

This shows predictions that are more reasonable.

mattfidler commented 1 month ago

@PyryValitalo hopefully this gives you a way forward while I refactor the linear compartmental solutions.

PyryValitalo commented 1 month ago

Thanks, this helps forward, and apologies if I sounded impatient in the opening post. There is no rush on my side, I am using nlmixr2 for a hobby project with no deadline.

I agree that the predictions are more reasonable for focei family of models (m0 and m2), but I guess we agree that the SAEM predictions (m1) still seem off, given that we are fitting the correct model to a dataset with an adequate number of observations. By "correct model" i mean that the data-generating and data-analytic models are identical.

68747470733a2f2f692e696d6775722e636f6d2f7939637a5a65572e706e67

PyryValitalo commented 1 month ago

Also, I confirm that using the ODE solution seems to be the critical step in getting focei algorithms to work as expected.

Side comments:

Thanks for the info on expit and logit. Thanks also for the pointer on being able to use lnorm() to specify lognormal distribution. To clarify: Is the use of lnorm() in nlmixr2 similar to specifying Y=F*EXP(EPS(1)) in NONMEM, which in the case of NONMEM would result in applying a first-order taylor expansion on the residual error, and the end result would be the same as specifying a proportional error modelY=F*(1+EPS(1)) ?

Or is the use of lnorm() in nlmixr2 similar to using a "dynamic transform both sides" approach as outlined by Dosne et al (https://doi.org/10.1007%2Fs10928-015-9460-y) with the relevant parameters fixed so that the resulting transformation is logarithm transformation? I ask because I think the end result is different between the two approaches, with the "dynamic transform both sides" approach corresponding to the true lognormal distribution.

Finally, just want to mention: Thanks to the whole nlmixr2 team for making the effort; giving out, maintaining and documenting an open source tool for pharmacometric modelling.

mattfidler commented 1 month ago

Thanks, this helps forward, and apologies if I sounded impatient in the opening post. There is no rush on my side, I am using nlmixr2 for a hobby project with no deadline.

This no problem from my end. I simply said that since it took me awhile to get to the response, not that I thought you were impatient in the least :smile:

I guess we agree that the SAEM predictions (m1) still seem off

You likely need to change the default options in terms of the number of iterations in the first and second phase of the algorithm. You could also increase the number of chains used to increase the accuracy of saem.

Or is the use of lnorm() in nlmixr2 similar to using a "dynamic transform both sides"

Yes, it is similar to the dynamic transformation of both sides, though instead of the Box-Cox with a fixed lambda value, it actually uses log(). That being said, nlmixr2 also support boxCox() and yeoJohnson() which are also both done with dynamic transformation of both sides.

You can see all the residual options (including t() for focei) here:

https://nlmixr2.github.io/rxode2/articles/rxode2-syntax.html#residual-functions-when-using-rxode2-functions

Thanks to the whole nlmixr2 team for making the effort; giving out, maintaining and documenting an open source tool for pharmacometric modelling.

You are welcome. I of course appreciate the rest of the nlmixr2 team and the time the companies give for them to help with this project. I also appreciate Novartis use some company time to work on this project.

PyryValitalo commented 1 month ago

I still think there exists a problem also for SAEM estimation. I tried increasing the number of iterations and chains, and relaxing the initial estimates of OMEGA to allow better initial search. The problem persists.

m1 <- basemodel1 %>%
  ini({eta.cl1 ~ 0.9
    eta.vc ~ 0.9
    eta.ka ~ 0.9
    eta.f ~0.9}) %>%
  nlmixr2(df,"saem",control=saemControl(covMethod = "s",
                                        nBurn = 2000, ## default 200
                                        nEm = 3000, ## default 300
                                        nmc = 6,)) ## default 3 
m1 <- mutate(m1,TRT=dftemplate$TRT[is.na(dftemplate$AMT)])
ggplot(m1,aes(TIME,IPRED,group=ID))+geom_line()+geom_point(aes(y=DV))+facet_wrap(~TRT)

image

On the other hand, running the model with saemix package does produce reasonable predictions:

library(saemix)
predder <- function(psi,id,xidep) {
  dose <- xidep[,2]
  ti <- xidep[,1]
  trt <- xidep[,3]
  cl=psi[id,1];v1=psi[id,2];ka=psi[id,3];f=psi[id,4]
  k=cl/v1
  po <- f*dose*ka/(ka-k)/v1*(exp(-k*ti)-exp(-ka*ti))
  iv <- dose/v1*exp(-k*ti)
  ifelse(trt==1,po,iv)
}
dftemp <- filter(df,is.na(AMT))
saemixdf <- saemixData(name.data=dftemp,
                       name.group=c("id"),name.predictors=c("time","dose","TRT"),
                       name.response=c("DV"),name.covariates=c("logwt"))
saemix_m1 <- saemixModel(model=predder,modeltype="structural",
                         psi0=matrix(c(2.1,25,.8,.8,.75,1,0,0),ncol=4, byrow=TRUE,
                                     dimnames=list(NULL, c("cl","v1","ka","f"))),
                         transform.par=c(1,1,1,3),
                         covariate.model=matrix(c(1,1,0,0),ncol=4,byrow=TRUE),
                         covariance.model=diag((rep(1,4))),
                         omega.init=diag((rep(.25,4))),
                         error.model="exponential")
m1s <- saemix(saemix_m1,saemixdf)
plot(m1s,plot.type="vpc")

The VPC looks like this (i didn't know how to stratify the VPC in saemix package, so there is only a single VPC. Nevertheless, it can be seen that the observations and simulated values match, as one would expect).

image

To conclude, the saemix model successfully converges to reasonable estimates, whereas the nlmixr2 with SAEM estimation does not.

mattfidler commented 1 month ago

Hi @PyryValitalo

I am unclear if this is due to rxode2 or simply saem from nlmixr2est.

You are using solved systems in saemix, so to try to figure this out a bit more, I will be working in solved systems in rxode2/nlmixr2 so I can debug this faster and easier.

While there may not be alot of activity on this for awhile, I will be working on the components needed to understand this a bit more.

mattfidler commented 1 month ago

I will also transfer this to nlmixr2est since that is where the saem algorithm is located.