metrumresearchgroup / bbr

R interface for model and project management
https://metrumresearchgroup.github.io/bbr/

tweak_initial_estimates: Matrix handling #640

Open barrettk opened 5 months ago

barrettk commented 5 months ago

During development of the tweak_initial_estimates PR, we determined that some additional checks and modifications are required for matrix-type records (OMEGA and SIGMA).

For one, we have to check that jittered values still result in a positive-definite matrix. Altering off-diagonal values seems to increase the likelihood of breaking positive-definiteness, but doesn't make the problem any harder to solve.
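
As a rough illustration of that check (values and jitter logic are assumed here, not the PR's implementation):

```r
# Minimal sketch: jitter an OMEGA block, then verify it is still
# positive-definite via its eigenvalues.
set.seed(1234)

omega <- matrix(c(0.09, 0.01,
                  0.01, 0.09), nrow = 2)

# jitter every element by up to +/- 10% (hypothetical tweak logic)
jittered <- omega * (1 + runif(length(omega), -0.1, 0.1))
jittered <- (jittered + t(jittered)) / 2  # re-symmetrize

# positive-definite iff all eigenvalues are strictly positive
all(eigen(jittered, only.values = TRUE)$values > 0)
```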

Proposal:

Other questions:

Examples to think about

$omega 1 sd fix 2
$theta 10 11
$omega block(2) fix
3
0.1 cor 4
$omega block(2)
3 fix
0.1 uni 4
barrettk commented 5 months ago

FYI it was determined that we do in fact need additional handling for cor matrices. In this case, the diagonals are variances and the off-diagonal values specify correlations. In these instances, we first have to turn the matrix into a proper correlation or covariance matrix, check for positive-definiteness (making any necessary adjustments), and then convert it back to the original format (variances on the diagonal, correlations off-diagonal).
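
A minimal sketch of that round trip, with hypothetical helper names (not bbr functions):

```r
# Convert a NONMEM-style VARIANCE + CORRELATION block (variances on the
# diagonal, correlations off-diagonal) into a covariance matrix for the
# positive-definiteness check, and back again afterwards.
varcor_to_cov <- function(mat) {
  sd <- sqrt(diag(mat))
  cov <- outer(sd, sd) * mat  # rho_ij * sd_i * sd_j -> covariance
  diag(cov) <- diag(mat)      # diagonal already holds the variances
  cov
}

cov_to_varcor <- function(cov) {
  mat <- stats::cov2cor(cov)  # correlations off-diagonal
  diag(mat) <- diag(cov)      # put variances back on the diagonal
  mat
}
```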

barrettk commented 5 months ago

We also determined that it would not be possible to check for positive-definiteness using the full matrix in the example below:

> nmrec::extract_omega(ctl)
      [,1]  [,2]  [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0.500    NA    NA   NA   NA   NA   NA   NA
[2,] 0.001 0.500    NA   NA   NA   NA   NA   NA
[3,] 0.001 0.001 0.500   NA   NA   NA   NA   NA
[4,] 0.001 0.001 0.001  0.5   NA   NA   NA   NA
[5,]    NA    NA    NA   NA 0.01   NA   NA   NA
[6,]    NA    NA    NA   NA 0.00 0.01   NA   NA
[7,]    NA    NA    NA   NA 0.00 0.00 0.01   NA
[8,]    NA    NA    NA   NA 0.00 0.00 0.00 0.01

We would first have to convert NA values to 0 to run Matrix::nearPD, and then change them back to NA after positive-definiteness was achieved. This is necessary because the indices of the NA occurrences are used to tell nmrec to 'skip' these values when setting them in the control stream file (they also denote which values were actually specified in the control stream).

This presents a problem: nearPD could change the 0's to some other value to ensure positive-definiteness, and turning them back to NA would discard that adjustment.

I proposed an alternative: we follow a similar approach for each matrix-type record, rather than for the combined matrix. We would then check these two matrices separately:

      [,1]  [,2]  [,3]  [,4]
[1,] 0.500    NA    NA   NA 
[2,] 0.001 0.500    NA   NA   
[3,] 0.001 0.001 0.500   NA 
[4,] 0.001 0.001 0.001  0.5 
     [,1]  [,2] [,3] [,4]
[1,] 0.01   NA   NA   NA
[2,] 0.00 0.01   NA   NA
[3,] 0.00 0.00 0.01   NA
[4,] 0.00 0.00 0.00 0.01

After they are confirmed to be positive-definite, we could then diagonally concatenate them and set the other values to NA. @curtisKJ suggested the simpar package did something pretty similar.
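
A quick sanity check of that claim (values assumed, not from the models above): the eigenvalues of a block-diagonal matrix are just the pooled eigenvalues of its blocks, so positive-definite blocks always combine into a positive-definite whole.

```r
library(Matrix)

omega_1 <- matrix(0.001, nrow = 4, ncol = 4)
diag(omega_1) <- 0.5
omega_2 <- diag(0.01, 4)

# diagonally concatenate the per-record blocks
omega_full <- as.matrix(Matrix::bdiag(omega_1, omega_2))
all(eigen(omega_full, only.values = TRUE)$values > 0)
#> [1] TRUE
```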

cc @kyleam - this would require extract_omega to return the individual matrices (per record). I really like the full matrix output, so perhaps that could either (A) be an attribute, or (B) we could have a helper for formatting a given record as a matrix. I'm sure we will need to discuss this more/provide more background, but just giving you some preliminary background on this discussion.

barrettk commented 5 months ago

cc @seth127 @curtisKJ @timwaterhouse The only issue I'm seeing right now after my investigation is rounding (again). After making the matrix positive-definite, rounding can make it no longer positive-definite. That being said, the example I tested included a negative covariance, which I haven't seen in any examples before. I do think rounding could still be an issue though.

We can't follow the method we used for $THETA records (in terms of addressing rounding issues with bounds), as the loop for checking positive-definiteness would become a bit absurd. I'm starting to think we should just remove rounding as an argument, as it seems like a lot to juggle while also balancing positive-definiteness.

Functions used in this example

```r
#' Function to check and modify a matrix for positive-definiteness
#'
#' @param mat a symmetric matrix, or a matrix with only lower triangular values
#'   specified (assumed to be symmetric).
#' @param corr logical indicating if the matrix should be a correlation matrix.
#'
#' @keywords internal
check_and_modify_pd <- function(mat, corr = FALSE, digits = 3) {
  # Coerce NA values to 0 for checking positive-definiteness
  # pass in original matrix to determine placement of NA values
  mat_zero_spec <- fmt_mat_zeros(mat)
  mat_zero <- mat_zero_spec$mat

  # TODO: check if matrix is correlation or covariance matrix, and pass to nearPD
  if (!is_positive_definite(mat_zero)) {
    mat_zero <- Matrix::nearPD(
      mat_zero, base.matrix = TRUE,
      corr = corr, ensureSymmetry = TRUE)$mat
  }

  # Round and ensure matrix is still positive-definite
  # NOTE: the result of this check is currently unused (see rounding discussion)
  mat_zero <- round(mat_zero, digits)
  is_positive_definite(mat_zero)

  # Reset original NA values
  mat_zero[mat_zero_spec$na_indices] <- NA
  return(mat_zero)
}

#' Check if matrix is positive-definite
#'
#' Assumes matrix is a square matrix, is symmetric, and contains only numeric
#' values
#'
#' @details
#' References:
#' https://github.com/TomKellyGenetics/matrixcalc/blob/master/R/is.positive.definite.R
#' https://www.geeksforgeeks.org/fixing-non-positive-definite-correlation-matrices-using-r/
#'
#' @param mat a square matrix
#'
#' @keywords internal
is_positive_definite <- function(mat){
  eigenvalues <- eigen(mat, only.values = TRUE)$values
  return(all(eigenvalues > 0))
}

#' Make matrix symmetric and format NA values as zeros
#'
#' @inheritParams check_and_modify_pd
#'
#' @returns a list containing the formatted matrix, and original `NA` indices
#'
#' @keywords internal
fmt_mat_zeros <- function(mat){
  # Store *original* NA indices
  na_indices <- is.na(mat)

  # make matrix symmetric
  mat_sym <- make_mat_symmetric(mat)

  # Set current NA values to 0 (if any)
  mat_sym[is.na(mat_sym)] <- 0

  # Return new matrix and indices of NA values
  return(
    list(
      mat = mat_sym,
      na_indices = na_indices
    )
  )
}

#' Make a lower triangular matrix a symmetric one
#'
#' @inheritParams check_and_modify_pd
#'
#' @noRd
make_mat_symmetric <- function(mat){
  # t() keeps row/column pairs aligned; assigning `mat[lower.tri(mat)]`
  # directly would scramble values for matrices larger than 3x3
  mat[upper.tri(mat)] <- t(mat)[upper.tri(mat)]
  return(mat)
}

#' Check if matrix is symmetric (with no tolerance)
#'
#' @param mat a square matrix
#'
#' @noRd
mat_is_symmetric <- function(mat){
  Matrix::isSymmetric(mat, checkDN = FALSE, tol = 0)
}
```

Example

In this example, I am only checking a specific $OMEGA record, rather than the full matrix.

For 1001.ctl (modified: first record has two diagonal values):

$OMEGA
0.09          ;[P] KOUThealthy
0.07          ; new eta value
$OMEGA BLOCK (3)
    0.09          ;[P] KOUTdmd
    0.01          ;[P] cov21
    0.09          ;[P] KINdmd
    0.01          ;[P] cov31
    0.01          ;[P] cov32
    0.09          ;[P] SCALAR
.mod <- read_model(file.path(MODEL_DIR_X, "1001"))
omega_full <- get_initial_est(.mod)$omegas

# Ideally, this separation would be done on the `nmrec` side
omega_1 <- omega_full[1:2, 1:2]
omega_2 <- omega_full[3:5, 3:5]

> omega_full
     [,1] [,2] [,3] [,4] [,5]
[1,] 0.09   NA   NA   NA   NA
[2,]   NA 0.07   NA   NA   NA
[3,]   NA   NA 0.09   NA   NA
[4,]   NA   NA 0.01 0.09   NA
[5,]   NA   NA 0.01 0.01 0.09
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 0.09

# Set a value as negative to ensure matrix is -not- positive-definite
> omega_2[3,3] <- -0.3
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3

Illustration of rounding being an issue

```r
# Original Matrix
> mat <- omega_2
> mat
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3

# Coerce NA values to 0 for checking positive-definiteness
# pass in original matrix to determine placement of NA values
> mat_zero_spec <- fmt_mat_zeros(mat)
> mat_zero <- mat_zero_spec$mat
> mat_zero
     [,1] [,2]  [,3]
[1,] 0.09 0.01  0.01
[2,] 0.01 0.09  0.01
[3,] 0.01 0.01 -0.30

> # Make matrix positive-definite if it isn't already
> if (!is_positive_definite(mat_zero)) {
+   mat_zero <- Matrix::nearPD(
+     mat_zero, base.matrix = TRUE,
+     corr = corr, ensureSymmetry = TRUE)$mat
+ }

> # New positive-definite matrix
> mat_zero
            [,1]        [,2]         [,3]
[1,] 0.090187111 0.010187111 0.0025062166
[2,] 0.010187111 0.090187111 0.0025062166
[3,] 0.002506217 0.002506217 0.0001251551
> is_positive_definite(mat_zero)
[1] TRUE

> # Round and ensure matrix is still positive-definite
> mat_zero <- round(mat_zero, digits)
> mat_zero
      [,1]  [,2]  [,3]
[1,] 0.090 0.010 0.003
[2,] 0.010 0.090 0.003
[3,] 0.003 0.003 0.000

# No longer positive-definite after rounding
> is_positive_definite(mat_zero)
[1] FALSE
```

In the event the matrix had actually been positive-definite here after rounding, I would have then done the following:

```r
> # Reset original NA values
> mat_zero[mat_zero_spec$na_indices] <- NA
> mat_zero
      [,1]  [,2] [,3]
[1,] 0.090    NA   NA
[2,] 0.010 0.090   NA
[3,] 0.003 0.003    0
```

After doing this for each record, I would then diagonally concatenate the two matrices again:

```r
# `check_and_modify_pd()` handles all the steps illustrated above
omega_1 <- check_and_modify_pd(omega_1)
omega_2 <- check_and_modify_pd(omega_2)

# combine matrices into full matrix
# Note: I wouldn't use this method, as it doesn't allow NAs, but showing for
# illustration purposes
omega_full <- as.matrix(Matrix::bdiag(list(omega_1, omega_2)))
```

- I would then check for positive-definiteness of the combined matrix (with zeros instead of NAs)
- I would have a `dev_bug()` in the event it wasn't, because everything I've read indicated that diagonally concatenated matrices should retain their positive-definiteness
barrettk commented 5 months ago

FWIW I tried something like the following, in an attempt to use the rounded matrix as the starting point, and that didn't work either:

iteration <- 1
while (iteration <= 100 && !is_positive_definite(mat_zero)) {
  # Use nearPD to ensure positive-definiteness
  mat_zero <- Matrix::nearPD(
    mat_zero, base.matrix = TRUE,
    corr = corr, ensureSymmetry = TRUE)$mat

  # Apply rounding to the positive-definite matrix
  mat_zero <- round(mat_zero, digits)

  # Increment iteration counter
  iteration <- iteration + 1
}

In this example, it does work with digits = 5, but obviously we can't rely on that for any type of matrix. We could attempt to "shrink" the number of required decimal places using a similar approach (iteratively increasing digits), but is that really worth it if the result doesn't match the specified number of digits?
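
For concreteness, a sketch of that digit-shrinking idea (hypothetical helper, not part of the PR):

```r
# Round as coarsely as requested, backing off one digit at a time until the
# rounded matrix stays positive-definite.
round_pd <- function(mat, digits = 3, max_digits = 15) {
  for (d in digits:max_digits) {
    out <- round(mat, d)
    if (all(eigen(out, only.values = TRUE)$values > 0)) {
      return(out)
    }
  }
  mat  # no rounding level worked; return the unrounded matrix
}
```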

However, it's possible the example I'm working with is flawed for one reason or another. Perhaps we would be able to round in some, or even most, cases. We could attempt to round, but not require it if it doesn't seem possible. Maybe something like this:

Example

> omega_1
     [,1] [,2]
[1,] 0.09   NA
[2,]   NA 0.07
> omega_2
     [,1] [,2] [,3]
[1,] 0.09   NA   NA
[2,] 0.01 0.09   NA
[3,] 0.01 0.01 -0.3
> check_and_modify_pd(omega_1)
     [,1] [,2]
[1,] 0.09   NA
[2,]   NA 0.07
> check_and_modify_pd(omega_2)
A record could not be rounded while ensuring positive-definiteness
            [,1]        [,2]         [,3]
[1,] 0.090187111          NA           NA
[2,] 0.010187111 0.090187111           NA
[3,] 0.002506217 0.002506217 0.0001251551
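
A hypothetical sketch of that "attempt but don't require" behavior, operating on the zero-filled matrix (helper name is mine):

```r
# Round if the result stays positive-definite; otherwise warn and keep the
# unrounded positive-definite matrix, as in the output above.
round_if_pd <- function(mat, digits = 3) {
  rounded <- round(mat, digits)
  if (all(eigen(rounded, only.values = TRUE)$values > 0)) {
    return(rounded)
  }
  warning("A record could not be rounded while ensuring positive-definiteness")
  mat
}
```
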
kyleam commented 5 months ago

Some scattered thoughts...

FYI It was determined that we do in fact need additional handling for cor matrices. In this case, the diagonals are variances, and off-diagonal values specify correlation.

I'm confused by what you're referring to as "cor" matrices.

Given these NONMEM options

... I guess by "cor" matrix you're referring to a block marked with CORRELATION along with VARIANCE?

Wouldn't you also need to handle the mismatch in the other direction (i.e. STANDARD with COVARIANCE)?

Also, I guess CHOLESKY representations need to be considered.
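
To illustrate with assumed values (not from this thread): a CHOLESKY block stores a lower-triangular factor L rather than the covariance itself, and the implied covariance L %*% t(L) is positive definite by construction whenever L has a nonzero diagonal, so that representation would need its own conversion before any check.

```r
L <- matrix(c(0.3, 0.0,
              0.1, 0.2), nrow = 2, byrow = TRUE)
cov <- L %*% t(L)  # covariance implied by the Cholesky factor
all(eigen(cov, only.values = TRUE)$values > 0)
#> [1] TRUE
```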


I'm not spotting any discussion on xN values or VALUES(diag,odiag), at least explicitly. Say you have a record like

$OMEGA BLOCK(6)
0.1
0.01 0.1
(0.01)x2 0.1
(0.01)x3 0.1
(0.01)x4 0.1
(0.01)x5 0.1

where extract_omega would give you

      [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 20.10   NA   NA   NA   NA   NA
[2,] 20.01 20.1   NA   NA   NA   NA
[3,] 20.01   NA 20.1   NA   NA   NA
[4,] 20.01   NA   NA 20.1   NA   NA
[5,] 20.01   NA   NA   NA 20.1   NA
[6,] 20.01   NA   NA   NA   NA 20.1

I take it the idea is to fill in those repeat values before computing the nearest positive definite matrix. If that's the case, even if those repeated values are filled in (by bbr or by nmrec or by using final results of previous run), set_omega will overwrite only the explicit values. So, doesn't that mean the value that actually ends up in the control stream may not line up with the computed value? That seems like a showstopper that would require a major update to the nmrec::set_* helpers to solve.


For the rounding issues, have you checked whether you can just let NONMEM deal with those? From the docs:

With NONMEM 7.3 if the initial estimate of a block is not positive
definite because of rounding errors, a value will be added to the
diagonal elements to make it positive definite. A message in the
NONMEM report file will indicate that this was done.  E.g.,

  DIAGONAL SHIFT OF  1.1000E-03 WAS IMPOSED TO ENSURE POSITIVE DEF-
  INITENESS
barrettk commented 5 months ago

In terms of cor, I was referring to the example I show in the main issue (shorthand for CORRELATION). @curtisKJ indicated that this meant the off diagonals specified correlations, but the diagonals were still variances, meaning the matrix has to first be converted to either a correlation matrix, or a covariance matrix, before checking for positive-definiteness.

In terms of the other flags you mentioned, I hadn't looked into them too closely, but yeah we would absolutely have to convert those as well. nearPD can only take a covariance or correlation matrix, so we would definitely have to convert STANDARD diagonals as well.
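
As a sketch of what the STANDARD conversion would involve (hypothetical helper; assumes a STANDARD + COVARIANCE block, i.e. standard deviations on the diagonal and covariances off-diagonal):

```r
# Square the diagonal (sd -> variance) so the block becomes a proper
# covariance matrix before any covariance-based check.
sd_diag_to_cov <- function(mat) {
  out <- mat
  diag(out) <- diag(mat)^2
  out
}
```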

I take it the idea is to fill in those repeat values before computing the nearest positive definite matrix. If that's the case, even if those repeated values are filled in (by bbr or by nmrec or by using final results of previous run), set_omega will overwrite only the explicit values. So, doesn't that mean the value that actually ends up in the control stream may not line up with the computed value? That seems like a showstopper that would require a major update to the nmrec::set_* helpers to solve.

I assume the SAME() flag is similar to the example you provided, though I can definitely see some differences. Seth mentioned the SAME() flag late yesterday, and it wasn't something I had thought about. I'm a little confused by your example though - where is the 20 coming from? (assuming that's a typo)

My original idea was to just overwrite the values not specified in the control stream (as an actual value) with NA. That wouldn't work though: nearPD would likely change each of those repeated values to ensure positive-definiteness, and there isn't a way to ensure they all stay the same (you can only use keepDiag = TRUE). I could envision a way to handle that during the jittering portion, but not during the nearPD() call. I'm honestly not sure how we could address that.
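
One hypothetical shape for that jitter-side handling (helper name is mine): draw a single perturbation for a repeated value and write it into every tied position, so (0.01)x5-style repeats stay identical.

```r
jitter_tied <- function(mat, tied_idx, pct = 0.1) {
  # one draw shared by all tied positions
  new_val <- mat[tied_idx][1] * (1 + stats::runif(1, -pct, pct))
  mat[tied_idx] <- new_val
  mat
}
```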

kyleam commented 5 months ago

I'm a little confused by your example though - where is the 20 coming from? (assuming that's a typo)

Sorry, just a typo (copied the wrong output):

     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.10   NA   NA   NA   NA   NA
[2,] 0.01  0.1   NA   NA   NA   NA
[3,] 0.01   NA  0.1   NA   NA   NA
[4,] 0.01   NA   NA  0.1   NA   NA
[5,] 0.01   NA   NA   NA  0.1   NA
[6,] 0.01   NA   NA   NA   NA  0.1
kyleam commented 5 months ago

In terms of cor, I was referring to the example I show in the main issue (shorthand for CORRELATION). @curtisKJ indicated that this meant the off diagonals specified correlations, but the diagonals were still variances [...]

Okay, so in that particular example, that reading follows given that VARIANCE is the default. But yes, as far as I understand, you need to consider the other flags when making that decision, rather than just saying CORRELATION means X. In other words, the sets of options interact. I think the combinations you need to worry about are VARIANCE + CORRELATION and STANDARD + COVARIANCE, as well as CHOLESKY.