Increasing the number of iterations may lead the algorithm to error out

jranke commented 2 years ago

The following reproducible example fits a biexponential decline curve to the dimethenamid data available from the mkin package. I assume that the problem is caused by some kind of overparameterisation. However, I think the algorithm should not simply stop, but return the current solution, with a warning that is as specific as possible on which part of the model is overparameterised.

library(saemix)
#> Loading required package: npde
#> Package saemix, version 3.0
#>   please direct bugs, questions and feedback to emmanuelle.comets@inserm.fr
#> 
#> Attaching package: 'saemix'
#> The following objects are masked from 'package:npde':
#> 
#>     kurtosis, skewness
library(mkin)
#> Loading required package: parallel
require(purrr)
#> Loading required package: purrr

# Data
dmta_ds <- lapply(1:7, function(i) {
  ds_i <- dimethenamid_2018$ds[[i]]$data
  ds_i[ds_i$name == "DMTAP", "name"] <-  "DMTA"
  ds_i$time <- ds_i$time * dimethenamid_2018$f_time_norm[i]
  ds_i
})
names(dmta_ds) <- sapply(dimethenamid_2018$ds, function(ds) ds$title)
dmta_ds[["Elliot"]] <- rbind(dmta_ds[["Elliot 1"]], dmta_ds[["Elliot 2"]])
dmta_ds[["Elliot 1"]] <- dmta_ds[["Elliot 2"]] <- NULL

ds_saemix_all <- purrr::map_dfr(dmta_ds, function(x) x, .id = "ds")
ds_saemix <- subset(ds_saemix_all, !is.na(value) & name == "DMTA",
  c("ds", "name", "time", "value"))

dmta_data <- saemix::saemixData(ds_saemix,
  name.group = "ds",
  name.predictors = c("time", "name"),
  name.response = "value",
  verbose = FALSE)

# Model
dfop_function <- function(psi, id, xidep) {
  g <- psi[id, 4]
  t <- xidep[, "time"]
  psi[id, 1] * (g * exp(- psi[id, 2] * t) +
  (1 - g) * exp(- psi[id, 3] * t))
}

psi0_dfop <- matrix(c(98.8, 0.087, 0.01, 0.93), nrow = 1,
  dimnames = list(NULL, c("DMTA_0", "k1", "k2", "g")))

dfop_model <- saemix::saemixModel(dfop_function, psi0 = psi0_dfop,
    "SFO dimethenamid", transform.par = c(0, 1, 1, 3),
    error.model = "combined", verbose = FALSE)

# Fits
saemix_dfop_800_100 <- saemix(dfop_model, dmta_data,
  saemixControl(print = FALSE, save = FALSE, save.graphs = FALSE,
    nbiter.saemix = c(800, 100), nb.chain = 9))
plot(saemix_dfop_800_100, plot.type = "convergence")


saemix_dfop_1000_100 <- try(saemix(dfop_model, dmta_data,
  saemixControl(print = FALSE, save = FALSE, save.graphs = FALSE,
    nbiter.saemix = c(1000, 100), nb.chain = 9)))
#> Error in solve.default(omega.eta) : 
#>   system is computationally singular: reciprocal condition number = 6.55036e-17

^{Created on 2022-02-09 by the reprex package (v2.0.1)}

Session info

``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.1.2 (2021-11-01) #> os Debian GNU/Linux 11 (bullseye) #> system x86_64, linux-gnu #> ui X11 #> language en #> collate de_DE.UTF-8 #> ctype de_DE.UTF-8 #> tz Europe/Berlin #> date 2022-02-09 #> pandoc 2.9.2.1 @ /usr/bin/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) #> backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.2) #> cli 3.1.1 2022-01-20 [1] CRAN (R 4.1.2) #> colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) #> crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.2) #> curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.2) #> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.2) #> deSolve 1.30 2021-10-07 [1] CRAN (R 4.1.1) #> digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.2) #> dplyr 1.0.8 2022-02-08 [1] CRAN (R 4.1.2) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) #> fansi 1.0.2 2022-01-14 [1] CRAN (R 4.1.2) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.3) #> fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.2) #> generics 0.1.2 2022-01-31 [1] CRAN (R 4.1.2) #> ggplot2 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) #> glue 1.6.1 2022-01-22 [1] CRAN (R 4.1.2) #> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.0.0) #> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) #> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.1) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> knitr 1.37 2021-12-16 [1] CRAN (R 4.1.2) #> lattice 0.20-45 2021-09-22 [1] CRAN (R 4.1.1) #> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.1) #> lmtest 0.9-39 2021-11-07 [1] CRAN (R 4.1.2) #> magrittr 2.0.2 2022-01-26 [1] CRAN (R 4.1.2) #> mclust 5.4.9 2021-12-17 [1] CRAN (R 4.1.2) #> mime 0.12 2021-09-28 [1] CRAN (R 4.1.1) #> mkin * 1.0.5 2021-09-15 [1] CRAN (R 4.1.2) #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0) #> nlme 3.1-155 2022-01-13 [1] CRAN (R 4.1.2) #> npde * 3.2 2021-12-20 [1] CRAN (R 4.1.2) #> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.1.2) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) #> purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) #> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.0) #> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.0.3) #> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.0.2) #> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.1) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.1) #> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.1) #> rlang 1.0.1 2022-02-03 [1] CRAN (R 4.1.2) #> rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.1) #> saemix * 3.0 2022-02-08 [1] CRAN (R 4.1.2) #> scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.1) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2) #> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) #> styler 1.6.2 2021-09-23 [1] CRAN (R 4.1.1) #> tibble 3.1.6 2021-11-07 [1] CRAN (R 4.1.2) #> tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) #> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) #> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) #> withr 2.4.3 2021-11-30 [1] CRAN (R 4.1.2) #> xfun 0.29 2021-12-14 [1] CRAN (R 4.1.2) #> xml2 1.3.3 2021-11-30 [1] CRAN (R 4.1.2) #> yaml 2.2.2 2022-01-25 [1] CRAN (R 4.1.2) #> zoo 1.8-9 2021-03-09 [1] CRAN (R 4.0.5) #> #> [1] /usr/local/lib/R/site-library #> [2] /usr/lib/R/site-library #> [3] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

jranke commented 2 years ago

Interestingly, the estimate for the variance of k2 approaches zero with increasing interations. This may be the problem, as k2 is ill-defined in some of the groups:

print(saemix_dfop_800_100)
....
----------------------------------------------------
----                  Results                   ----
----------------------------------------------------
-----------------  Fixed effects  ------------------
----------------------------------------------------
     Parameter Estimate SE     CV(%)
[1,] DMTA_0    98.2715  0.9925  1.0 
[2,] k1         0.0645  0.0157 24.3 
[3,] k2         0.0093  0.0013 14.3 
[4,] g          0.9496  0.0224  2.4 
[5,] a.1        1.0687  0.1021  9.5 
[6,] b.1        0.0295  0.0036 12.2 
----------------------------------------------------
-----------  Variance of random effects  -----------
----------------------------------------------------
       Parameter     Estimate SE   CV(%)  
DMTA_0 omega2.DMTA_0 4.1e+00  3.37 8.2e+01
k1     omega2.k1     3.5e-01  0.20 5.8e+01
k2     omega2.k2     3.9e-12  0.03 7.7e+11
g      omega2.g      1.1e+00  0.69 6.5e+01
----------------------------------------------------
------  Correlation matrix of random effects  ------
----------------------------------------------------
              omega2.DMTA_0 omega2.k1 omega2.k2 omega2.g
omega2.DMTA_0 1             0         0         0       
omega2.k1     0             1         0         0       
omega2.k2     0             0         1         0       
omega2.g      0             0         0         1       
----------------------------------------------------

This makes me wonder if the variance parameters are fitted on the log scale, and if not, wouldn't you agree that this would be promising in this case? Of course a log cholesky parameterisation of Omega as possible in nlme may be even smarter, but I am afraid that it would complicate the calculation of confidence intervals (the intervals() function in nlme can do it, but I don't know how). In case of a log parameterisation, the confidence interval could be calculated on the log scale, and then backtransformed to the natural scale to give an aysmmetric confidence interval.

jranke commented 2 years ago

BTW, increasing the number of chains to 100 makes it possible to do 10 000 iterations in the first phase. The estimate for omega.k2 obtained is 3.26e-11. Apparently, increasing the number of chains in this case means that omega.k2 approaches zero more slowly.

ecomets commented 2 years ago

As discussed by email, this is not an algorithmic issue per se. In this case it seems to be related to parameter identifiability, but it should really be up to the user to check whether their data is compatible with the complexity of the model they are trying to fit.

saemixdevelopment / saemixextension

Increasing the number of iterations may lead the algorithm to error out #29