nlmixrdevelopment / nlmixr

nlmixr: an R package for population PKPD modeling
https://nlmixrdevelopment.github.io/nlmixr/
GNU General Public License v2.0

very different SAEM results and other tutorial issues #29

Closed JSC67 closed 6 years ago

JSC67 commented 6 years ago

The tutorial (https://github.com/nlmixrdevelopment/nlmixr/blob/master/vignettes/running_nlmixr.Rmd) has code for compiling a model for a closed-form solution using linCmt() and for an ODE solution, and also shows how one might use the nlme and SAEM algorithms. With the single-dose theophylline data at hand, this gives 4 possibilities. I followed along with the tutorial but made a few changes for clarity and for comparison, and found that the nlme results for the closed form and for the ODE agree very well (but not exactly), whereas the SAEM objective function values, AIC, BIC, and log-likelihood are very different for the explicit versus the ODE model and do not agree with the nlme results either. The SAEM results seem to be way off. Below is the code and results (and by the way, is there a good way to make all of the following pasted text appear in a uniform typewriter font?): tutorial.txt tutorial output.txt


R version 3.4.3 (2017-11-30) -- "Kite-Eating Tree"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> setwd("C:/Users/James/Documents/pharmacometrics/practice/nlmixr")
> #https://github.com/nlmixrdevelopment/nlmixr/blob/master/vignettes/running_nlmixr.Rmd
> 
> # Preliminaries
> library(ggplot2)
> library(nlmixr)
Loading required package: nlme
Loading required package: RxODE
> str(theo_sd)
'data.frame':   144 obs. of  7 variables:
 $ ID  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ TIME: num  0 0 0.25 0.57 1.12 2.02 3.82 5.1 7.03 9.05 ...
 $ DV  : num  0 0.74 2.84 6.57 10.5 9.66 8.58 8.36 7.47 6.89 ...
 $ AMT : num  4.02 0 0 0 0 0 0 0 0 0 ...
 $ EVID: int  101 0 0 0 0 0 0 0 0 0 ...
 $ CMT : int  1 2 2 2 2 2 2 2 2 2 ...
 $ WT  : num  79.6 79.6 79.6 79.6 79.6 79.6 79.6 79.6 79.6 79.6 ...
> ggplot(theo_sd, aes(TIME, DV)) +
+   geom_line(aes(group=ID), col="red") + 
+   scale_x_continuous("Time (h)") +
+   scale_y_continuous("Concentration") +
+   labs(title="Theophylline single-dose",
+        subtitle="Concentration vs. time by individual")
> # Try fitting a simple one-compartment PK model to this small dataset.
> # 'uif' for Unified Interface Function, lin for linCmt()
> uif.lin <- function() {
+   ini({
+     logtcl <- -3.2      # log typical value Cl (L/hr)
+     logtka <-  0.5      # log typical value Ka (1/hr)
+     logtv  <- -1        # log typical value V  (L)
+     
+     # error model
+     add.err <- 0.1
+     
+     # Initial estimates for IIV variances
+     # Labels work for single parameters.
+     eta.cl ~ 2
+     eta.ka ~ 1    ## BSV Ka
+     eta.v  ~ 1
+   })
+   model({
+     cl <- exp(logtcl + eta.cl)
+     ka <- exp(logtka + eta.ka)
+     v  <- exp(logtv  + eta.v)
+     linCmt() ~ add(add.err)
+   })
+ }
> # We can alternatively express the same model by ODEs:
> uif.ode <- function() {
+   ini({
+     logtcl <- -3.2      # log typical value Cl (L/hr)
+     logtka <-  0.5      # log typical value Ka (1/hr)
+     logtv  <- -1        # log typical value V  (L)
+     
+     # error model
+     add.err <- 0.1
+     
+     # Initial estimates for IIV variances
+     # Labels work for single parameters.
+     eta.cl ~ 2
+     eta.ka ~ 1    ## BSV Ka
+     eta.v  ~ 1
+   })
+   model({
+     cl <- exp(logtcl + eta.cl)
+     ka <- exp(logtka + eta.ka)
+     v  <- exp(logtv  + eta.v)
+     d/dt(depot)  = -ka * depot
+     d/dt(center) =  ka * depot - cl / v * center
+     cp = center / v
+     cp ~ add(add.err)
+   })
+ }
> # NLME fittings
> 
> fit.nlme.lin <- nlmixr(uif.lin, theo_sd, est="nlme", calc.resid=FALSE)

**Iteration 1
LME step: Loglik: -182.2318, nlminb iterations: 1
reStruct  parameters:
      ID1       ID2       ID3 
1.0022617 0.2783009 2.0758984 
PNLS step: RSS =  63.14922 
 fixed effects: -3.211937  0.4479979  -0.7859318  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
   fixed reStruct 
0.272375 3.267309 

**Iteration 2
LME step: Loglik: -179.291, nlminb iterations: 9
reStruct  parameters:
       ID1        ID2        ID3 
0.96385527 0.08333435 1.63552295 
PNLS step: RSS =  63.28221 
 fixed effects: -3.211535  0.4441505  -0.7863391  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
      fixed    reStruct 
0.008662381 0.172135330 

**Iteration 3
LME step: Loglik: -179.3381, nlminb iterations: 8
reStruct  parameters:
       ID1        ID2        ID3 
0.96129928 0.07085834 1.63885629 
PNLS step: RSS =  63.2196 
 fixed effects: -3.211732  0.4458486  -0.7861743  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
      fixed    reStruct 
0.003808629 0.082919683 

**Iteration 4
LME step: Loglik: -179.3201, nlminb iterations: 7
reStruct  parameters:
       ID1        ID2        ID3 
0.96243507 0.07618717 1.63736945 
PNLS step: RSS =  63.22991 
 fixed effects: -3.211823  0.4451494  -0.7862432  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
      fixed    reStruct 
0.001570583 0.013884126 

**Iteration 5
LME step: Loglik: -179.3277, nlminb iterations: 4
reStruct  parameters:
       ID1        ID2        ID3 
0.96197111 0.07396744 1.63796900 
PNLS step: RSS =  63.24112 
 fixed effects: -3.211722  0.4454457  -0.7862145  
 iterations: 6 
Convergence crit. (must all become <= tolerance = 1e-05):
       fixed     reStruct 
0.0006651255 0.0147544693 

**Iteration 6
LME step: Loglik: -179.3246, nlminb iterations: 1
reStruct  parameters:
       ID1        ID2        ID3 
0.96215433 0.07487426 1.63772430 
PNLS step: RSS =  63.24112 
 fixed effects: -3.211722  0.4454457  -0.7862145  
 iterations: 1 
Convergence crit. (must all become <= tolerance = 1e-05):
       fixed     reStruct 
0.000000e+00 3.910435e-10 
Warning message:
In nlmixrUI.nlme.var(obj) :
  Initial condition for additive error ignored with nlme
> fit.nlme.ode <- nlmixr(uif.ode, theo_sd, est="nlme", calc.resid=FALSE)

**Iteration 1
LME step: Loglik: -182.2318, nlminb iterations: 1
reStruct  parameters:
      ID1       ID2       ID3 
1.0022618 0.2783008 2.0759012 
PNLS step: RSS =  63.24481 
 fixed effects: -3.211675  0.4451503  -0.7862385  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
    fixed  reStruct 
0.2718787 2.7010441 

**Iteration 2
LME step: Loglik: -179.3267, nlminb iterations: 6
reStruct  parameters:
       ID1        ID2        ID3 
0.96197025 0.07396818 1.63806766 
PNLS step: RSS =  63.23773 
 fixed effects: -3.211876  0.4453302  -0.7862234  
 iterations: 2 
Convergence crit. (must all become <= tolerance = 1e-05):
       fixed     reStruct 
0.0004040574 0.0088556330 

**Iteration 3
LME step: Loglik: -179.3253, nlminb iterations: 1
reStruct  parameters:
       ID1        ID2        ID3 
0.96208503 0.07453515 1.63782577 
PNLS step: RSS =  63.23773 
 fixed effects: -3.211876  0.4453302  -0.7862234  
 iterations: 1 
Convergence crit. (must all become <= tolerance = 1e-05):
       fixed     reStruct 
0.000000e+00 3.183879e-10 
Warning message:
In nlmixrUI.nlme.var(obj) :
  Initial condition for additive error ignored with nlme
> # SAEM fittings
> 
> # For SAEM, it's nice to avoid printing each C++ iteration to the monitor.
> sink("saem output")
> fit.saem.lin <- nlmixr(uif.lin, theo_sd, est="saem")  # works with spaces in path
Compiling SAEM user function...C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saem137829e957ffx64.cpp -o saem137829e957ffx64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saem137829e957ffx64.dll tmp.def saem137829e957ffx64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
done.
Using sympy via SnakeCharmR
C:/RTools/3.4/mingw_64/bin/gcc  -I"C:/R/R-34~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mtune=generic -c rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.c -o rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.o
C:/RTools/3.4/mingw_64/bin/gcc -shared -s -static-libgcc -o rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.dll tmp.def rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
The model has been setup to calculate residuals.
It will be cached for future runs.
Calculating Table Variables...
done
nlmixr UI combined dataset and properties
 $ par.hist         : Parameter history (if available)
 $ par.hist.stacked : Parameter history in stacked form for easy plotting (if available)
 $ par.fixed        : Fixed Effect Parameter Table
 $ eta              : Individual Parameter Estimates
 $ seed             : Seed (if applicable)
 $ model.name       : Model name (from R function)
 $ data.name        : Name of R data input
> # Using RStudio: in the Environment tab, an object curi with value 0 has appeared.
> # What is curi?
> # NB: The following won't work with spaces in the full file path; I don't know why:
> fit.saem.ode <- nlmixr(uif.ode, theo_sd, est="saem")  # fails with spaces in path
Compiling RxODE differential equations...done.
C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saem13783d7445b6x64.cpp -o saem13783d7445b6x64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saem13783d7445b6x64.dll tmp.def saem13783d7445b6x64.o C:/Users/James/Documents/pharmacometrics/practice/nlmixr/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
done.
## Calculate ETA-based prediction and error derivatives:
Calculate Jacobian...................done.
Calculate sensitivities.......
done.
## Calculate d(f)/d(eta) 
## ...
## done
## ...
## done
C:/RTools/3.4/mingw_64/bin/gcc  -I"C:/R/R-34~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mtune=generic -c rx_32a3413ee26a7ebdb0539062d9fc430e_x64.c -o rx_32a3413ee26a7ebdb0539062d9fc430e_x64.o
C:/RTools/3.4/mingw_64/bin/gcc -shared -s -static-libgcc -o rx_32a3413ee26a7ebdb0539062d9fc430e_x64.dll tmp.def rx_32a3413ee26a7ebdb0539062d9fc430e_x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
C:/RTools/3.4/mingw_64/bin/gcc  -I"C:/R/R-34~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mtune=generic -c rx_a2baf9a7f76ed6696b7c9c5507ef49d7_x64.c -o rx_a2baf9a7f76ed6696b7c9c5507ef49d7_x64.o
C:/RTools/3.4/mingw_64/bin/gcc -shared -s -static-libgcc -o rx_a2baf9a7f76ed6696b7c9c5507ef49d7_x64.dll tmp.def rx_a2baf9a7f76ed6696b7c9c5507ef49d7_x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
The model-based sensitivities have been calculated.
It will be cached for future runs.
Calculating Table Variables...
done
nlmixr UI combined dataset and properties
 $ par.hist         : Parameter history (if available)
 $ par.hist.stacked : Parameter history in stacked form for easy plotting (if available)
 $ par.fixed        : Fixed Effect Parameter Table
 $ eta              : Individual Parameter Estimates
 $ seed             : Seed (if applicable)
 $ model.name       : Model name (from R function)
 $ data.name        : Name of R data input
> sink()
> # Now compare fits for explicit (i.e., linCmt() ) model vs. ODE, and by algorithms:
> fit.nlme.lin  # Log-likelihood: -179.3246
Nonlinear mixed-effects model fit by maximum likelihood
  Model: DV ~ (nlmeModList("user_fn"))(logtcl, eta.cl, logtka, eta.ka,      logtv, eta.v, TIME, ID) 
  Log-likelihood: -179.3246
  Fixed: logtcl + logtka + logtv ~ 1 
    logtcl     logtka      logtv 
-3.2117217  0.4454457 -0.7862145 

Random effects:
 Formula: list(eta.cl ~ 1, eta.ka ~ 1, eta.v ~ 1)
 Level: ID
 Structure: Diagonal
           eta.cl    eta.ka    eta.v  Residual
StdDev: 0.2644566 0.6422369 0.134573 0.6921699

Number of Observations: 132
Number of Groups: 12 
> fit.nlme.ode  # Log-likelihood: -179.3253
Nonlinear mixed-effects model fit by maximum likelihood
  Model: DV ~ (nlmeModList("user_fn"))(logtcl, eta.cl, logtka, eta.ka,      logtv, eta.v, TIME, ID) 
  Log-likelihood: -179.3253
  Fixed: logtcl + logtka + logtv ~ 1 
    logtcl     logtka      logtv 
-3.2118758  0.4453302 -0.7862234 

Random effects:
 Formula: list(eta.cl ~ 1, eta.ka ~ 1, eta.v ~ 1)
 Level: ID
 Structure: Diagonal
           eta.cl    eta.ka     eta.v  Residual
StdDev: 0.2644679 0.6424376 0.1345558 0.6921515

Number of Observations: 132
Number of Groups: 12 
> fit.saem.lin  # Log-likelihood: -148.4241
nlmixr SAEM fit (Solved)

     OBJF      AIC      BIC Log-likelihood
 296.8482 310.8482 331.0278      -148.4241

Time (sec; $time):
        saem setup Likelihood Calculation covariance table
elapsed 53.9 65.11                   2.09          0   1.5

Parameters ($par.fixed):
                          Parameter Estimate     SE    CV Untransformed
logtcl  log typical value Cl (L/hr)    -3.22 0.0816  2.5%        0.0401
logtka  log typical value Ka (1/hr)    0.450  0.194 43.0%          1.57
logtv      log typical value V  (L)   -0.784 0.0435  5.6%         0.457
add.err                     add.err    0.692                      0.692
                 (95%CI)
logtcl  (0.0342, 0.0471)
logtka      (1.07, 2.29)
logtv     (0.419, 0.497)
add.err                 

Omega ($omega):
          eta.cl    eta.ka      eta.v
eta.cl 0.0696184 0.0000000 0.00000000
eta.ka 0.0000000 0.4187747 0.00000000
eta.v  0.0000000 0.0000000 0.01819676

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES    RES  IWRES     WRES  CWRES   CPRED     CRES
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>    <dbl>  <dbl>   <dbl>    <dbl>
1 1     0     0.740  0     0     0.740 0.740   1.07   2.06e-5 -0.999  4.81e4 -4.81e+4
2 1     0.250 2.84   3.85  2.82 -1.01  0.0178 -1.46   2.57e-2 -1.46   3.85e0 -1.01e+0
3 1     0.570 6.57   6.76  5.06 -0.191 1.51   -0.276  2.19e+0 -0.276  6.76e0 -1.91e-1
# ... with 129 more rows, and 3 more variables: eta.cl <dbl>, eta.ka <dbl>,
#   eta.v <dbl>
> fit.saem.ode  # Log-likelihood:  -58.07448
nlmixr SAEM fit (ODE)

    OBJF     AIC      BIC Log-likelihood
 116.149 130.149 150.3286      -58.07448

Time (sec; $time):
         saem setup Likelihood Calculation covariance table
elapsed 49.99 17.71                   0.28          0  1.18

Parameters ($par.fixed):
                          Parameter Estimate     SE    CV Untransformed
logtcl  log typical value Cl (L/hr)    -3.22 0.0818  2.5%        0.0400
logtka  log typical value Ka (1/hr)    0.458  0.192 41.9%          1.58
logtv      log typical value V  (L)   -0.782 0.0437  5.6%         0.458
add.err                     add.err    0.694                      0.694
                 (95%CI)
logtcl  (0.0341, 0.0470)
logtka      (1.09, 2.30)
logtv     (0.420, 0.499)
add.err                 

Omega ($omega):
           eta.cl    eta.ka      eta.v
eta.cl 0.07190448 0.0000000 0.00000000
eta.ka 0.00000000 0.4174639 0.00000000
eta.v  0.00000000 0.0000000 0.01818897

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES     RES  IWRES    WRES  CWRES CPRED  CRES
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl> <dbl> <dbl>
1 1     0     0.740  0     0     0.740 0.740    1.07  1.07    1.07    0    0.740
2 1     0.250 2.84   3.90  2.83 -1.06  0.00644 -1.53  0.00382 0.0813  2.66 0.177
3 1     0.570 6.57   6.83  5.07 -0.255 1.50    -0.368 0.675   0.632   4.83 1.74 
# ... with 129 more rows, and 3 more variables: eta.cl <dbl>, eta.ka <dbl>,
#   eta.v <dbl>
> # Why are they so very different?
> # Why does an object x (1 obs. of 4 variables) appear after fit.saem.ode only?
> 
> # Suggestion: Use the same printing format for models with either algorithm.
mattfidler commented 6 years ago

It was very difficult for me to read this, since you pasted your R session output directly into the comment. I edited the comment so I could read it; I hope you don't mind. In the future, please put ```R on a line by itself before your code and close it with ``` on a line by itself afterwards to enclose R code.
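
For example, a comment would be typed like this (the two fence lines disappear when GitHub renders it, and everything between them shows up in a monospaced font, which also answers the typewriter-font question):

```R
library(nlmixr)
fit <- nlmixr(uif.lin, theo_sd, est="saem")
```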

Also I'm unsure exactly what your questions were; I think these are your questions:

Why are the likelihoods different between nlme and SAEM?

Each method makes its own likelihood approximation. The nlme approximation is not the same as the FOCEi approximation. The SAEM algorithm stochastically draws from the probability model and maximizes that stochastic approximation of the likelihood (see https://www.math.u-bordeaux.fr/MAS10/DOC/PDF-PRES/Lavielle.pdf). Therefore, the conventional wisdom is that likelihoods from two different methods should not be compared, because the approximations are different and the optimization algorithms are different. In fact, for nlme with REML you cannot really even compare two models at all (see https://stats.stackexchange.com/questions/116770/reml-or-ml-to-compare-two-mixed-effects-models-with-differing-fixed-effects-but).
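
To make this concrete, here is a small self-contained R sketch (a toy one-subject nonlinear model, not nlmixr code, and not the specific approximations nlme, FOCEi, or SAEM use) showing that three different approximations of the same marginal-likelihood integral (numerical integration, a Laplace-type expansion, and a crude Monte Carlo average) give three different numbers even at identical parameter values:

# Toy model: y_j ~ N(exp(theta + eta), sigma^2), eta ~ N(0, omega^2); one subject, 5 observations
set.seed(1)
theta <- 1; omega <- 0.5; sigma <- 0.3        # assumed toy population values
eta.true <- rnorm(1, 0, omega)
y <- rnorm(5, exp(theta + eta.true), sigma)

# log of the joint density p(y, eta) as a function of the random effect eta
ljoint <- function(eta)
  sum(dnorm(y, exp(theta + eta), sigma, log = TRUE)) + dnorm(eta, 0, omega, log = TRUE)

# (a) "exact" marginal log-likelihood by numerical integration over eta
exact <- log(integrate(function(e) sapply(e, function(x) exp(ljoint(x))), -5, 5)$value)

# (b) Laplace approximation: quadratic expansion of ljoint around its mode
opt  <- optimize(ljoint, c(-5, 5), maximum = TRUE)
h    <- 1e-4
hess <- (ljoint(opt$maximum + h) - 2 * ljoint(opt$maximum) + ljoint(opt$maximum - h)) / h^2
laplace <- opt$objective + 0.5 * log(2 * pi) - 0.5 * log(-hess)

# (c) crude Monte Carlo average over draws from the random-effect distribution
mc <- log(mean(exp(sapply(rnorm(2000, 0, omega),
                          function(e) sum(dnorm(y, exp(theta + e), sigma, log = TRUE))))))

c(exact = exact, laplace = laplace, monte.carlo = mc)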

Why are the parameter estimates different?

This is because of the different algorithms used to optimize the problem.

Why are the objects different?

This is mainly due to a CRAN limitation on running vignettes in a timely manner, and on only using software that is available on the CRAN machines. I don't believe they have a recent Python or SymPy installation on those systems.

The conversion of a traditional nlme object to an nlmixr-style object is controlled by the calc.resid option. If you turn it on with calc.resid=TRUE (which is the default), or drop it altogether, you get much closer objects. Running the solved system gives:

> uif <- function() {
    ini({
        tka <- .5
        tcl <- -3.2
        tv <- -1
        eta.ka ~ 1
        eta.cl ~ 2
        eta.v ~ 1
        add.err <- 0.1
    })
    model({
        ka <- exp(tka + eta.ka)
        cl <- exp(tcl + eta.cl)
        v <- exp(tv + eta.v)
        linCmt() ~ add(add.err)
    })
}
> fit <- nlmixr(uif, theo_sd, est="nlme")

**Iteration 1
LME step: Loglik: -182.2318, nlminb iterations: 1
reStruct  parameters:
      ID1       ID2       ID3 
0.2783009 1.0022617 2.0758984 
PNLS step: RSS =  63.14918 
 fixed effects: 0.4479976  -3.211924  -0.785932  
 iterations: 7 
Convergence crit. (must all become <= tolerance = 1e-05):
    fixed  reStruct 
0.2723747 3.2674463 
... And even more output
> fit
nlmixr nlme fit by maximum likelihood (Solved)

FOCEi-based goodness of fit metrics:
     OBJF      AIC      BIC Log-likelihood
 296.5985 310.5985 330.7781      -148.2992

nlme-based goodness of fit metrics:
      AIC      BIC Log-likelihood
 372.6488 392.8285      -179.3244

Time (sec; $time):
        nlme setup FOCEi Evaulate covariance table
elapsed 2.17 21.44            0.5          0  0.96

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.445  0.198 44.4%          1.56     (1.06, 2.30)
tcl        -3.21 0.0845  2.6%        0.0403 (0.0341, 0.0475)
tv        -0.786 0.0463  5.9%         0.456   (0.416, 0.499)
add.err    0.692                      0.692                 

Omega ($omega):
         eta.ka    eta.cl     eta.v
eta.ka 0.412483 0.0000000 0.0000000
eta.cl 0.000000 0.0699386 0.0000000
eta.v  0.000000 0.0000000 0.0181094

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES    RES  IWRES      WRES  CWRES    CPRED
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>     <dbl>  <dbl>    <dbl>
1 1     0     0.740  0     0     0.740 0.740   1.07  0.0000207 -1.00  48128   
2 1     0.250 2.84   3.86  2.82 -1.02  0.0218 -1.47  0.0316    -1.47      3.86
3 1     0.570 6.57   6.78  5.05 -0.207 1.52   -0.299 2.19      -0.299     6.78
# ... with 129 more rows, and 4 more variables: CRES <dbl>, eta.ka <dbl>,
#   eta.cl <dbl>, eta.v <dbl>
> 

Note that the 95% confidence intervals between each of the two model's parameters overlap, so they are not statistically different.

Hopefully this answers your questions.

JSC67 commented 6 years ago

Hi Matt,

Thank you for poring over my question; I'm sorry it was hard to decipher. I don't mind in the least that you reworded it for clarity for your sake. Thank you for the ```R ... ``` tip to demarcate R output; I looked and looked and didn't see such instructions on GitHub. Now I know.

Your understanding of my questions was as follows:

Why do the likelihoods differ between nlme and SAEM? Why are the parameter estimates different between the different estimation methods? Why are the objects and outputs different between the two algorithms?

This is close but not quite. Below are my questions more clearly laid out:

(1) Why do the likelihoods differ so much between SAEM and nlme?

(2) Why do the likelihoods differ so much between SAEM using ODEs versus using the solved form (i.e., linCmt())?

Regarding question (1), I understand that different algorithms will give different answers for likelihoods, parameter estimates, and so on, but I am a bit surprised by just how different they are between SAEM, which is an approximate solution to an exact equation, and nlme, which is an "exact" solution (subject to floating-point rounding errors) to an approximate equation for what is implied by the model. Perhaps I shouldn't be quite so surprised by a 68% discrepancy, but I am. I'd expect that for such a simple problem the discrepancy between algorithms would be maybe 10-15%, but from your example I guess I'm being naive, since calc.resid=TRUE showed both nlme and FOCEi values, the latter being better and closely agreeing with SAEM. If there is a correct answer to the question of what the likelihood is for a given model/data set combination (and there is, barring some strange pathological cases), then I'd expect it to be better approximated by a more or less decent approximation to the desired model than by a more exact answer to an almost-right model. As long as a correct value for the likelihood exists, the conventional wisdom about not comparing values from different algorithms rings hollow to me: if the correct answer is known to very good approximation, comparing such values is a measure of algorithmic error; and if a highly accurate gold-standard algorithm is not available, the situation becomes like statistically summarizing inter-rater variability, e.g., among subjective judges of figure skaters. The correct value does exist, and so we should be able to say something about algorithmic errors.

Regarding question (2), I am even more surprised that, using ostensibly the same algorithm (which, yes, I realize uses stochastic approximation) and the same data set, one would get answers that differ by a factor of 2.6 depending on whether one is solving the ODEs or using their closed-form solution directly. This isn't a difficult ODE, so why a 2.6-fold difference in answers? Using nlme with ODEs vs. the ODE solution, the difference in likelihoods is very small: -179.3253 versus -179.3246, which I can chalk up to minor round-off error, probably more so in the ODE than in the direct solution, but for the SAEM algorithm the values are -58.07448 versus -148.4241, which is just so much bigger a difference that I cannot wave it away.

Also some minor tangential questions:

(3) What is it about SAEM fitting with ODEs that's different from SAEM fitting with solved equations? When I tried the latter in a folder that had spaces in its path, it worked, but when trying the same with ODEs in the same folder, it didn't work. (I didn't include the output from this, and for the purpose of my question I re-did things first in a folder without spaces; that was my first command in what I did show.) This seems somehow related to the C++ compiler, as the following output shows:

> getwd()
[1] "C:/Users/James/Documents"
> setwd("C:\\Users\\James\\Documents\\academic\\conceptual sciences")
> # SAEM fittings
> 
> # For SAEM, it's nice to avoid printing each C++ iteration to the monitor.
> sink("saem output")
> 
> fit.saem.lin <- nlmixr(uif.lin, theo_sd, est="saem")  # works with spaces in path
Compiling SAEM user function...C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saemf68125d2995x64.cpp -o saemf68125d2995x64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saemf68125d2995x64.dll tmp.def saemf68125d2995x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
done.
Using sympy via SnakeCharmR
C:/RTools/3.4/mingw_64/bin/gcc  -I"C:/R/R-34~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mtune=generic -c rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.c -o rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.o
C:/RTools/3.4/mingw_64/bin/gcc -shared -s -static-libgcc -o rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.dll tmp.def rx_920cdb7f8c8c7b5e42b1a8db66863398_x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
The model has been setup to calculate residuals.
It will be cached for future runs.
Diagonal form: sqrt
     [,1]
[1,]    1
Calculate symbolic inverse...done
Calculate symbolic determinant of inverse...done
Calculate d(Omega)/d(Est) and d(Omega^-1)/d(Est)...
.0
...done.
Calculating Table Variables...
done
nlmixr UI combined dataset and properties
> # Now this shouldn't work, because there are spaces in the current path:
> # NB: The following won't work with spaces in the full file path; I don't know why:
> fit.saem.ode <- nlmixr(uif.ode, theo_sd, est="saem")  # fails with spaces in path
Compiling RxODE differential equations...C:/RTools/3.4/mingw_64/bin/gcc  -I"C:/R/R-34~1.3/include" -DNDEBUG          -O2 -Wall  -std=gnu99 -mtune=generic -c rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.c -o rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.o
C:/RTools/3.4/mingw_64/bin/gcc -shared -s -static-libgcc -o rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll tmp.def rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.o -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
done.
C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saemf68340822c6x64.cpp -o saemf68340822c6x64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saemf68340822c6x64.dll tmp.def saemf68340822c6x64.o C:/Users/James/Documents/academic/conceptual sciences/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
g++.exe: error: C:/Users/James/Documents/academic/conceptual: No such file or directory
g++.exe: error: sciences/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll: No such file or directory
Error in inDL(x, as.logical(local), as.logical(now), ...) : 
  unable to load shared object 'C:/Users/James/Documents/academic/conceptual sciences/saemf68340822c6x64.dll':
  LoadLibrary failure:  The specified module could not be found.
> sink()
> # NB: The following won't work with spaces in the full file path; I don't know why:
> fit.saem.ode <- nlmixr(uif.ode, theo_sd, est="saem")  # fails with spaces in path
Compiling RxODE differential equations...done.
C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saemf6832983e6bx64.cpp -o saemf6832983e6bx64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saemf6832983e6bx64.dll tmp.def saemf6832983e6bx64.o C:/Users/James/Documents/academic/conceptual sciences/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
g++.exe: error: C:/Users/James/Documents/academic/conceptual: No such file or directory
g++.exe: error: sciences/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll: No such file or directory
Error in inDL(x, as.logical(local), as.logical(now), ...) : 
  unable to load shared object 'C:/Users/James/Documents/academic/conceptual sciences/saemf6832983e6bx64.dll':
  LoadLibrary failure:  The specified module could not be found.
> getwd()
[1] "C:/Users/James/Documents/academic/conceptual sciences"
> # Now to a directory without spaces.
> setwd("C:\\Users\\James\\Documents\\pharmacometrics\\practice\\nlmixr")
> fit.saem.ode <- nlmixr(uif.ode, theo_sd, est="saem")
Compiling RxODE differential equations...done.
C:/RTools/3.4/mingw_64/bin/g++  -I"C:/R/R-34~1.3/include" -DNDEBUG       -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/nlmixr/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/STANHE~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/Rcpp/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPAR~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/RCPPEI~1/include -IC:/Users/James/DOCUME~1/R/WIN-LI~1/3.4/BH/include   -O2 -Wall  -mtune=generic -c saemf682e2637edx64.cpp -o saemf682e2637edx64.o
C:/RTools/3.4/mingw_64/bin/g++ -shared -s -static-libgcc -o saemf682e2637edx64.dll tmp.def saemf682e2637edx64.o C:/Users/James/Documents/pharmacometrics/practice/nlmixr/rx_7cbddd1a680b22377f3ee375ef2ddda8_x64.dll -LC:/R/R-34~1.3/bin/x64 -lRblas -LC:/R/R-34~1.3/bin/x64 -lRlapack -lgfortran -lm -lquadmath -LC:/R/R-34~1.3/bin/x64 -lR
done.
1:   -3.2788   0.1920  -0.6129   1.9000   0.9500   0.9500   9.8654
2:   -3.3025   0.3758  -0.6708   1.8050   0.9025   0.9025   4.3859
 ... lots of similar lines ...
498:   -3.2179   0.4576  -0.7816   0.0719   0.4173   0.0182   0.4813
499:   -3.2179   0.4576  -0.7816   0.0719   0.4173   0.0182   0.4812
500:   -3.2179   0.4576  -0.7816   0.0719   0.4175   0.0182   0.4811
Diagonal form: sqrt
     [,1]
[1,]    1
Calculate symbolic inverse...done
Calculate symbolic determinant of inverse...done
Calculate d(Omega)/d(Est) and d(Omega^-1)/d(Est)...
.0
...done.
Calculating Table Variables...
done
nlmixr UI combined dataset and properties
 $ par.hist         : Parameter history (if available)
 $ par.hist.stacked : Parameter history in stacked form for easy plotting (if available)
 $ par.fixed        : Fixed Effect Parameter Table
 $ eta              : Individual Parameter Estimates
 $ seed             : Seed (if applicable)
 $ model.name       : Model name (from R function)
 $ data.name        : Name of R data input

(4) Why do the small objects curi and x appear? Object curi appeared with value 0 after executing fit.saem.lin <- nlmixr(uif.lin, theo_sd, est="saem"). Object x is a simple data frame that appears after calling fit.saem.ode or fit.saem.lin. (I was wrong in writing that it appeared after calling fit.saem.ode only.)

Finally, a new question based on your answer. You wrote, "The 95% confidence intervals between each of the two model's parameters overlap, so they are not statistically different," but from what you pasted (pasted again below), I only see one set of 95% CIs, not two. Does $par.fixed pertain to FOCEi-based or to nlme-based values? If I had to guess, I'd guess FOCEi, since it's more accurate.

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.445  0.198 44.4%          1.56     (1.06, 2.30)
tcl        -3.21 0.0845  2.6%        0.0403 (0.0341, 0.0475)
tv        -0.786 0.0463  5.9%         0.456   (0.416, 0.499)
add.err    0.692                      0.692                 

Thanks again.

mattfidler commented 6 years ago

The curi and x objects are bugs that were just fixed.

mattfidler commented 6 years ago

To compare the SAEM ODE and nlme ODE fits side by side, I have:

> fit
nlmixr SAEM fit (ODE)

     OBJF      AIC      BIC Log-likelihood
 116.1369 130.1369 150.3165      -58.06845

Time (sec; $time):
         saem setup Likelihood Calculation covariance table
elapsed 31.75 21.55                   0.55          0  1.07

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.455  0.197 43.3%          1.58     (1.07, 2.32)
tcl        -3.22 0.0802  2.5%        0.0401 (0.0343, 0.0469)
tv        -0.784 0.0431  5.5%         0.457   (0.420, 0.497)
add.err    0.691                      0.691                 

Omega ($omega):
          eta.ka     eta.cl      eta.v
eta.ka 0.4367946 0.00000000 0.00000000
eta.cl 0.0000000 0.06929573 0.00000000
eta.v  0.0000000 0.00000000 0.01842363

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES     RES  IWRES    WRES  CWRES CPRED
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl> <dbl>
1 1     0     0.740  0     0     0.740 0.740    1.07  1.07    1.07    0   
2 1     0.250 2.84   3.87  2.83 -1.03  0.00529 -1.50  0.00308 0.0761  2.67
3 1     0.570 6.57   6.80  5.07 -0.230 1.50    -0.332 0.659   0.614   4.84
# ... with 129 more rows, and 4 more variables: CRES <dbl>, eta.ka <dbl>,
#   eta.cl <dbl>, eta.v <dbl>
> fit.nlme
nlmixr nlme fit by maximum likelihood (ODE)

FOCEi-based goodness of fit metrics:
    OBJF     AIC      BIC Log-likelihood
 116.051 130.051 150.2306       -58.0255

nlme-based goodness of fit metrics:
      AIC      BIC Log-likelihood
 372.6507 392.8304      -179.3254

Time (sec; $time):
        nlme setup FOCEi Evaulate covariance table
elapsed 6.82  0.81           0.24          0  1.18

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.445  0.198 44.4%          1.56     (1.06, 2.30)
tcl        -3.21 0.0845  2.6%        0.0403 (0.0341, 0.0475)
tv        -0.786 0.0463  5.9%         0.456   (0.416, 0.499)
add.err    0.692                      0.692                 

Omega ($omega):
          eta.ka     eta.cl      eta.v
eta.ka 0.4126161 0.00000000 0.00000000
eta.cl 0.0000000 0.06994126 0.00000000
eta.v  0.0000000 0.00000000 0.01810577

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES    RES  IWRES   WRES  CWRES CPRED  CRES
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl>
1 1     0     0.740  0     0     0.740 0.740   1.07  1.07   1.07    0    0.740
2 1     0.250 2.84   3.86  2.82 -1.02  0.0220 -1.47  0.0131 0.0861  2.65 0.185
3 1     0.570 6.57   6.78  5.05 -0.207 1.52   -0.299 0.684  0.637   4.82 1.75 
# ... with 129 more rows, and 3 more variables: eta.ka <dbl>, eta.cl <dbl>,
#   eta.v <dbl>

I personally don't see any big differences between the two models.

Note that the standard errors are used to calculate the confidence intervals. These come from nlme and SAEM directly, not from FOCEi. The only thing FOCEi does is calculate possibly comparable objective functions.
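
As a concrete reading of the $par.fixed table above: the Untransformed column and its 95% CI look like a standard Wald interval computed on the log scale and then exponentiated. A sketch assuming that construction, using the printed tcl row (small differences are just rounding of the displayed estimate):

est <- -3.21; se <- 0.0845                # tcl row from $par.fixed above
exp(est)                                  # untransformed estimate, ~0.0403
exp(est + c(-1, 1) * qnorm(0.975) * se)   # roughly (0.034, 0.048), matching the printed 95% CI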

mattfidler commented 6 years ago

When looking at the solved systems side by side I get

> fit
nlmixr nlme fit by maximum likelihood (Solved)

FOCEi-based goodness of fit metrics:
     OBJF      AIC      BIC Log-likelihood
 296.5985 310.5985 330.7781      -148.2992

nlme-based goodness of fit metrics:
      AIC      BIC Log-likelihood
 372.6488 392.8285      -179.3244

Time (sec; $time):
        nlme setup FOCEi Evaulate covariance table
elapsed 2.99  1.56            2.1          0  0.85

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.445  0.198 44.4%          1.56     (1.06, 2.30)
tcl        -3.21 0.0845  2.6%        0.0403 (0.0341, 0.0475)
tv        -0.786 0.0463  5.9%         0.456   (0.416, 0.499)
add.err    0.692                      0.692                 

Omega ($omega):
         eta.ka    eta.cl     eta.v
eta.ka 0.412483 0.0000000 0.0000000
eta.cl 0.000000 0.0699386 0.0000000
eta.v  0.000000 0.0000000 0.0181094

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES    RES  IWRES      WRES  CWRES    CPRED
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>     <dbl>  <dbl>    <dbl>
1 1     0     0.740  0     0     0.740 0.740   1.07  0.0000207 -1.00  48128   
2 1     0.250 2.84   3.86  2.82 -1.02  0.0218 -1.47  0.0316    -1.47      3.86
3 1     0.570 6.57   6.78  5.05 -0.207 1.52   -0.299 2.19      -0.299     6.78
# ... with 129 more rows, and 4 more variables: CRES <dbl>, eta.ka <dbl>,
#   eta.cl <dbl>, eta.v <dbl>
> fit.saem
nlmixr SAEM fit (Solved)

     OBJF      AIC      BIC Log-likelihood
 296.8482 310.8482 331.0278      -148.4241

Time (sec; $time):
         saem setup Likelihood Calculation covariance table
elapsed 21.04  1.58                   0.36          0  1.17

Parameters ($par.fixed):
        Estimate     SE    CV Untransformed          (95%CI)
tka        0.450  0.194 43.0%          1.57     (1.07, 2.29)
tcl        -3.22 0.0816  2.5%        0.0401 (0.0342, 0.0471)
tv        -0.784 0.0435  5.6%         0.457   (0.419, 0.497)
add.err    0.692                      0.692                 

Omega ($omega):
          eta.ka    eta.cl      eta.v
eta.ka 0.4187747 0.0000000 0.00000000
eta.cl 0.0000000 0.0696184 0.00000000
eta.v  0.0000000 0.0000000 0.01819676

Fit Data (object is a modified data.frame):
# A tibble: 132 x 15
  ID     TIME    DV IPRED  PRED   IRES    RES  IWRES      WRES  CWRES    CPRED
  <fct> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>     <dbl>  <dbl>    <dbl>
1 1     0     0.740  0     0     0.740 0.740   1.07  0.0000206 -0.999 48098   
2 1     0.250 2.84   3.85  2.82 -1.01  0.0178 -1.46  0.0257    -1.46      3.85
3 1     0.570 6.57   6.76  5.06 -0.191 1.51   -0.276 2.19      -0.276     6.76
# ... with 129 more rows, and 4 more variables: CRES <dbl>, eta.ka <dbl>,
#   eta.cl <dbl>, eta.v <dbl>
> 
mattfidler commented 6 years ago

There is no difference between the solved systems and their results. The difference arises between the solved and ODE objective functions (though the parameters themselves are quite similar). This is because of the increased accuracy of the solved systems. It also shows that objective function differences may not lead to substantially different models.

As far as the fairness of likelihood measures between estimation methods goes, I think the only way it would be fair is if the likelihood were exact. Otherwise the biases in the likelihood approximations may favor different models under different methods. However, comparing likelihoods from the same method is OK if you believe the assumptions and approximations of the method are similar enough between the models.
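
For a within-method comparison, the arithmetic would look something like this (a sketch with made-up objective function values for two nested models fit by the same method; these numbers are hypothetical, not taken from the fits above):

objf.full <- 296.8; objf.reduced <- 303.5   # hypothetical OBJF values from the same method
d.objf <- objf.reduced - objf.full          # increase in OBJF when dropping one parameter
pchisq(d.objf, df = 1, lower.tail = FALSE)  # approximate likelihood-ratio p-value for nested models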

mattfidler commented 6 years ago

As for the difference between the solved and ODE systems: the ODE route compiles RxODE models, and there is a bug there that doesn't allow spaces in the path.
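
Until that is fixed, one possible workaround on Windows might be to switch the working directory to its 8.3 short form so the g++ command line contains no spaces (a sketch; shortPathName() is Windows-only and assumes 8.3 names are enabled on the drive):

# Windows-only sketch: use the space-free 8.3 short path as the working directory before nlmixr()
if (.Platform$OS.type == "windows") setwd(utils::shortPathName(getwd()))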

mattfidler commented 6 years ago

You can test whether spaces are now allowed in RxODE-style models for SAEM, and whether the global environment is still polluted, by updating to the latest nlmixr from GitHub:

devtools::install_github("nlmixrdevelopment/nlmixr")
JSC67 commented 6 years ago

Thank you, Matt! For a closed topic, there has sure been a lot of work, questions, and comments. I'm happy to have helped reveal some bugs. I think there is still at least one bug remaining, revealed by all of this.

As you've laid them side by side for the nlme vs. SAEM comparisons, I agree that the differences are very minor, but I'm still amazed by a 2.6-fold difference in likelihood between the ODE-based and explicit solutions. I downloaded the latest (today's) nlmixr from GitHub, re-ran the (more explicit) tutorial, and got this:

fit.nlme.lin     <- nlmixr(uif.lin, theo_sd, est="nlme")
fit.nlme.lin.crF <- nlmixr(uif.lin, theo_sd, est="nlme", calc.resid=FALSE)
fit.nlme.ode     <- nlmixr(uif.ode, theo_sd, est="nlme")
fit.nlme.ode.crF <- nlmixr(uif.ode, theo_sd, est="nlme", calc.resid=FALSE)
fit.saem.lin     <- nlmixr(uif.lin, theo_sd, est="saem")
fit.saem.ode     <- nlmixr(uif.ode, theo_sd, est="saem")
# Now compare fits for explicit (i.e., linCmt() ) model vs. ODE, and by algorithms:
fit.nlme.lin      # Log-likelihood: -1527.988 (FOCEi); -179.3246 (nlme)
fit.nlme.lin.crF  # Log-likelihood: -179.3246
fit.nlme.ode      # Log-likelihood: -58.02594 (FOCEi); -179.3253 (nlme)
fit.nlme.ode.crF  # Log-likelihood: -179.3253
fit.saem.lin      # Log-likelihood: -148.4241
fit.saem.ode      # Log-likelihood: -58.07448

Where did the -1527.988 FOCEi value come from for fit.nlme.lin? I re-ran it in the same session and got the value you got earlier:

fit.nlme.lin      # Log-likelihood: -148.2993 (FOCEi); -179.3246 (nlme)

So this anomaly wasn't repeatable. I attach both the edited (for clarity) and unedited (for completeness) R session outputs in case you want to take a look. Also, I tested SAEM with the ODE-based solution in a folder with spaces in its path, and that worked, so that bug is fixed.

I then repeated this tutorial with a clean slate (no .Rhistory in folder, no History commands in RStudio, fresh session with no C files or DLL files in folder). This time the -1527.988 anomaly did not appear; I got these values which I take as "final":

fit.nlme.lin      # Log-likelihood: -148.2993 (FOCEi); -179.3246 (nlme)
fit.nlme.lin.crF  # Log-likelihood: -179.3246
fit.nlme.ode      # Log-likelihood: -58.02594 (FOCEi); -179.3253 (nlme)
fit.nlme.ode.crF  # Log-likelihood: -179.3253
fit.saem.lin      # Log-likelihood: -148.4241
fit.saem.ode      # Log-likelihood: -58.07448

So, using nlme gives ~ -179, and using the default calc.resid=TRUE option gives a FOCEi value that agrees with SAEM: ~ -148 when the exact solution was used, and ~ -58 when the ODE form of the model was used. This exercise has been very informative!

Finally, there is still a little bug: the nuisance, cluttering object x still appeared after running this:

# NLME fittings
fit.nlme.lin     <- nlmixr(uif.lin, theo_sd, est="nlme")
# The cluttering object x has now appeared!
x
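
A quick way to see exactly which names a call leaves behind in the global environment (a sketch, using the uif.lin model defined earlier):

before <- ls()
fit.nlme.lin <- nlmixr(uif.lin, theo_sd, est="nlme")
setdiff(ls(), c(before, "before", "fit.nlme.lin"))  # any leftover names (e.g. "x") indicate leakage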

tutorial - edited session 2017-02-06.txt tutorial - unedited session 2017-02-06.txt

mattfidler commented 6 years ago

I cannot reproduce the "x" in the output.

uif <- function() {
    ini({
        tka <- .5
        tcl <- -3.2
        tv <- -1
        eta.ka ~ 1
        eta.cl ~ 2
        eta.v ~ 1
        add.err <- 0.1
    })
    model({
        ka <- exp(tka + eta.ka)
        cl <- exp(tcl + eta.cl)
        v <- exp(tv + eta.v)
        linCmt() ~ add(add.err)
    })
}
fit <- nlmixr(uif, theo_sd, est="nlme")

After running I get:

> ls()
[1] "fit" "uif"