julia-wrobel / registr

Register exponential family curves
Other
16 stars 4 forks source link

"Initial value in 'vmmin' is not finite" in register_fpca #11

Closed bauer-alex closed 4 years ago

bauer-alex commented 4 years ago

When applying the joint registration and Gaussian FPCA scheme (register_fpca) to the derivation curves of the Berkeley child growth data, the constrOptim() optimization sometimes runs into the error "Initial value in 'vmmin' is not finite". Here a code snippet to reproduce the error, and below some more insight into the problem:

library(registr)
library(dplyr)
library(ggplot2)
library(mgcv)    # for smoothing the raw curves

# Berkeley child growth data
dat <- fda::growth

# calculate the first derivative of the (slightly smoothed) curves --------
# transform to long dataset
dat_long <- data.frame(id    = rep(c(colnames(dat$hgtm), colnames(dat$hgtf)),
                                   each = length(dat$age)),
                       index = rep(dat$age, times = ncol(dat$hgtm) + ncol(dat$hgtf)),
                       value = c(as.vector(dat$hgtm), as.vector(dat$hgtf)))

# slightly smooth the raw curves
smooth_list <- lapply(unique(dat_long$id), function(curve_id) {
  d <- dat_long %>% filter(id == curve_id)
  m <- gam(value ~ s(index, bs = "cr", k = 15), data = d)
  d$value <- unname(predict.gam(m))
  return(d)
})
dat_smooth <- dplyr::bind_rows(smooth_list)

# take the first derivative
deriv_list <- lapply(unique(dat_long$id), function(curve_id) {
  d <- dat_smooth %>% filter(id == curve_id)
  data.frame(id               = curve_id,
             index            = d$index[2:nrow(d)],
             value            = diff(d$value) / diff(d$index),
             stringsAsFactors = FALSE)
})
dat_deriv <- dplyr::bind_rows(deriv_list) %>%
  mutate(id = factor(id))

# plot the first derivative
ggplot(dat_deriv, aes(x = index, y = value, col = id)) +
  geom_line() +
  ggtitle("First derivative (based on smoothed raw curves)") +
  theme(legend.position = "none")

curves

# register the first derivative curves ------------------------------------
res <- register_fpca(Y      = dat_deriv,
                     family = "gaussian",
                     npc    = 4,
                     Kt     = 8,
                     Kh     = 4,
                     max_iterations = 100)
# -> error in iteration 13: 'Initial value in "vmmin" is not finite'

The error occurred in the joint iteration 13 for curve 12, i.e. id boy12. The curve of this id looks like this:

dat_deriv %>%
  filter(id == "boy12") %>%
  ggplot(aes(x = index, y = value)) +
  geom_line() +
  ylim(range(dat_deriv$value))

boy12

What I found out so far:

bauer-alex commented 4 years ago

Additional notes:

I will take a deeper look into this (and the penalize_beta branch, as recommended by @julia-wrobel) after the summer, probably in October.

bauer-alex commented 4 years ago

I could resolve all issues regarding optimization problems like the above one. These problems arose since in the joint iteration the parameter vector passed to constrOptim was sometimes slightly outside the allowed range, or slightly improper / nonmonotone as of numerical problems of constrOptim.

In branch issue_11 I now included a new function ensure_proper_beta that simply slightly alters the parameter vector to resolve these issues before the optimization call. Since the alterations are really minimal this doesn't change the results.

The branch is based on the gfpca_twoStep branch. So we should first integrate the PR for the latter branch, and then create a PR for this one.

julia-wrobel commented 4 years ago

Thanks Alex- gfpca_twStep has been merged with the master branch. Is the issue_11 branch ready to be merged? If so I'll run some local checks and pull it into master.

bauer-alex commented 4 years ago

Yes, this branch is ready to be merged.

julia-wrobel commented 4 years ago

Thanks Alex- it's been merged with the master branch.