openpharma / mmrm

Mixed Models for Repeated Measures (MMRM) in R.
https://openpharma.github.io/mmrm/

Consider nested covariance structures? #174

Open danielinteractive opened 1 year ago

danielinteractive commented 1 year ago

So recently we had a user who wanted to model multiple assessments per visit per subject. We currently cannot model this adequately with the existing covariance structure framework, because:

1) we cannot have multiple observations per visit (checked since #173)
2) it does not really make sense to just define time = interaction(assessment, visit) and use that as the time variable

What would better "nested" covariance structures look like here? Basically we still want to keep independence between subjects (subject blocks in the overall covariance matrix), because that allows us to scale the computations well with the number of observations. On the other hand, we would like to include additional structure within subjects.

I am wondering if the Kronecker idea (@clarkliming) could be useful here or whether @kkmann has other ideas.

kkmann commented 1 year ago

Very interesting point. I wonder how SAS handles this?

This reminds me of the discussion we had a while ago on the relation between MMRM and Gaussian Processes. The Kronecker idea is (I think) equivalent to something that is quite common in the GP literature: composing elementary covariance functions to model more complex structures, see e.g. https://www.cs.toronto.edu/~duvenaud/cookbook/. A population-marginal MMRM is (I haven't fully worked this out yet) equivalent to a GP with a mean structure (usually the mean structure is not modelled in the GP literature) and a covariance function for the Gaussian residuals.

The "usual" covariance structures are all stationary covariance functions for a discrete metric on the space of visits, i.e. the covariance depends on the visits only through the number of visits between observations, not the actual timings. The "spatial" covariance structures relax this by depending on the actual time difference.

From a covariance function perspective, the problem with multiple observations per individual/visit arises because the difference (in either type of structure, spatial or not) is zero, hence the observations would be perfectly correlated and could not differ (singular vcov matrix). Numerically the fix is to add a diagonal to the vcov matrix; in covariance function terms that would be the composition of the original kernel with a "nugget" kernel.

The Kronecker trick is (I think) nothing but the product of two covariance functions (see e.g. http://gaussianprocess.org/gpml/chapters/RW4.pdf or the kernel cookbook link above again). Here it seems more natural to add kernels, though. If mmrm were to support such compositions (product and weighted sum), this would also allow implicit covariance structure selection if the weights of a sum of covariance structures are estimated as well. That is probably hard to identify well in practice, though.
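
To make the composition idea concrete, here is a rough R sketch (hypothetical names, not mmrm code): an AR(1)-type kernel over visits multiplied with a compound-symmetry kernel over assessments gives a Kronecker-type per-subject structure, and adding a "nugget" keeps the matrix non-singular even with several observations at the same visit.

# Rough R sketch (not mmrm code) of composing elementary covariance functions.
k_time   <- function(i, j, theta) theta$rho_t^abs(i - j)          # AR(1) over visit indices
k_assess <- function(a, b, theta) ifelse(a == b, 1, theta$rho_a)  # compound symmetry over assessments

theta <- list(sigma = 1, rho_t = 0.6, rho_a = 0.4, tau = 0.1)

# Two visits, two assessments (A, B) -> 4 observations per subject:
obs <- expand.grid(assess = 1:2, visit = 1:2)

k_full <- function(r, c) {
  theta$sigma^2 *
    k_time(obs$visit[r], obs$visit[c], theta) *
    k_assess(obs$assess[r], obs$assess[c], theta) +
    (r == c) * theta$tau^2                                         # "nugget" on the diagonal
}

outer(seq_len(nrow(obs)), seq_len(nrow(obs)), Vectorize(k_full))   # per-subject covariance block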

meixide commented 1 year ago

Let us assume we have two measurements, A and B, per visit per patient. Daniel then suggested writing each individual block in the overall covariance matrix as (assuming there are only two visits present in the data):

[image: proposed per-subject covariance block for assessments A and B across the two visits]

So a very useful starting point would be to allow the user to specify, for example, an ante-dependence structure for each block in the image above.

kkmann commented 1 year ago

Yes, I agree with that. My point is more about implementation. If we worked with covariance functions under the hood, there would be no need for a special case for multiple observations. The covariance could depend on the visit difference (see refs above). Then, in your notation, the "difference" between A_i and B_i is zero, hence the respective variance is returned. Now, to support heterogeneous variances, the covariance function would have to be non-stationary. Hm, what would the signature of a general covariance function be? Probably cov(visit_1, visit_2, theta), where theta are the covariance-function-specific parameters. The advantage is that this formulation is completely independent of the sequence of observations.
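
A tiny R illustration of that signature (hypothetical names, integer visit indices assumed): the variance part is per visit, i.e. non-stationary, while the correlation only depends on the visit difference, so two observations at the same visit simply get that visit's variance back.

# Sketch of a general covariance function cov(visit_1, visit_2, theta),
# non-stationary in the variance so heterogeneous variances are possible.
cov_het_ar1 <- function(visit_1, visit_2, theta) {
  theta$sd[visit_1] * theta$sd[visit_2] * theta$rho^abs(visit_1 - visit_2)
}

theta <- list(sd = c(1.0, 1.5, 2.0), rho = 0.5)
cov_het_ar1(1, 1, theta)  # two observations at visit 1: the visit-1 variance is returned
cov_het_ar1(1, 2, theta)  # 1.0 * 1.5 * 0.5 = 0.75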

danielinteractive commented 1 year ago

@kkmann Not sure I understand what you wrote :-D But I guess you mean that we could also formulate the non-spatial covariance structures in a similar way as we do for the spatial covariance structures? And maybe this could be helpful for more complex covariance structures such as the nested one here?

Can we maybe still also think about a lower-bar option? E.g., if we keep the non-spatial vs. spatial differentiation, how would we formulate a covariance structure for the nested case, e.g. the simple case with 2 visits and 2 assessments that @garcia-mc shows above?

kkmann commented 1 year ago

x) thanks for translating.

I think we all agree that it is statistically fine to have multiple observations (is this possible in SAS?); we just need to define the variance. Implementing this via covariance functions just removes the need to manually align the blocks (you do not even have to order the visit times). What exactly do you mean by "covariance structure for the nested case" then? To me a covariance structure is a function mapping a set of parameters and a dimension (number of visits) to a covariance matrix, like here: https://github.com/openpharma/mmrm/blob/0c5366a87ed66e50e3b7c25d31073586c568e90c/src/covariance.h#L10

A fancy way would be to define an abstract class "covariance function" and overload "()" so that it can be treated like a function as well. One could then define sums and products of covariance functions. A covariance matrix is then generated on the fly from the respective covariance function (or a composition thereof). That would be pretty flexible and would remove the need to sort anything or worry about duplicates. Otherwise this could quickly become an "index battle" x)
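
For illustration, a small R stand-in for that idea (hypothetical names; the real thing would live in the C++ covariance code): covariance functions as objects with "+" and "*" overloaded, and the matrix generated on the fly from whatever composition the user built.

# Covariance functions as composable objects (R sketch, not mmrm code).
covfun <- function(f) structure(f, class = "covfun")

"+.covfun" <- function(k1, k2) covfun(function(s, t, theta) k1(s, t, theta) + k2(s, t, theta))
"*.covfun" <- function(k1, k2) covfun(function(s, t, theta) k1(s, t, theta) * k2(s, t, theta))

k_cs     <- covfun(function(s, t, theta) ifelse(s == t, 1, theta$rho))    # compound symmetry
k_nugget <- covfun(function(s, t, theta) ifelse(s == t, theta$tau^2, 0))  # nugget

k <- k_cs + k_nugget  # composed kernel, still callable like a function

build_vcov <- function(k, visits, theta) outer(visits, visits, function(s, t) k(s, t, theta))
build_vcov(k, c(1, 1, 2), list(rho = 0.4, tau = 0.2))  # duplicated visit times need no special handling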

danielinteractive commented 1 year ago

@kkmann OK, and if we had these sums and products of covariance functions, how would we use them for the nested use case above? Can we write this down as a math formula? Just for my brain to be able to parse it.

kkmann commented 1 year ago

I wonder, does it make sense to go via a design document? Realistically I will only get to this mid next week though. We could play around with an R implementation of the covariance structure and then translate it to C++ once we feel it is helpful. The algebra of covariance functions is not even strictly necessary to solve this. Also, I think it might be better to separate the correlation structure and the variances. That could be beneficial since the correlation structure will almost always be stationary, but we need non-stationarity of the variance function to implement heteroscedasticity.
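
As a quick sketch of that separation (hypothetical names): a stationary correlation function over visits combined with a per-visit standard deviation function, giving Sigma = D R D with D the diagonal of standard deviations.

# Separate correlation structure (stationary) and variances (non-stationary).
corr_ar1 <- function(i, j, rho) rho^abs(i - j)

build_sigma <- function(visits, rho, sd_per_visit) {
  R <- outer(visits, visits, corr_ar1, rho = rho)   # correlation matrix
  D <- diag(sd_per_visit[visits])                   # per-visit standard deviations
  D %*% R %*% D
}

build_sigma(c(1, 2, 3), rho = 0.5, sd_per_visit = c(1.0, 1.2, 1.5))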

danielinteractive commented 1 year ago

Yeah design doc for this would be awesome!

kkmann commented 1 year ago

I hope this is not too time critical x) I'd need some muse to fiddle with this.

danielinteractive commented 1 year ago

@kkmann no worries, not time critical at all

danielinteractive commented 1 year ago

Got another question on this today (as part of our BBS workshop), so it seems there is some interest in this.

mrmorgan17 commented 8 months ago

I would just like to express my interest in having the functionality of nested covariance structures available in the mmrm package.

It appears that SAS implements this through the use of a Kronecker product covariance matrix:

/* Implements Kronecker product covariance matrix */
/* Unstructured over time and compound symmetric for endpoints at same time */
PROC MIXED DATA = DATA;
  CLASS TRT TIME SUBJECT PARAMCD;
  MODEL CHANGE = TRT TIME PARAMCD BASVAL BASVAL*TIME PARAMCD*TIME PARAMCD*TRT PARAMCD*BASVAL
              PARAMCD*TRT*TIME / DDFM=KR;   
  REPEATED TIME PARAMCD / SUBJECT = SUBJECT TYPE=UN@CS;
RUN;
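
For intuition, the per-subject covariance that TYPE=UN@CS implies can be written as a Kronecker product; below is a small base R illustration with made-up numbers (two visits, two endpoints), not mmrm or SAS output.

# Kronecker product covariance: unstructured over time x compound symmetry over endpoints.
U <- matrix(c(1.0, 0.5,
              0.5, 2.0), nrow = 2, byrow = TRUE)  # unstructured over 2 visits
rho <- 0.3
C <- matrix(c(1, rho,
              rho, 1), nrow = 2, byrow = TRUE)    # compound symmetry over 2 endpoints
kronecker(U, C)                                   # 4 x 4 per-subject block
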
wlandau commented 1 month ago

Possibly related as precedent: brms uses distributional regression to allow residual variances to depend on more than just the time variable. brms.mmrm leverages this in brm_formula_sigma() to define formulas like:

chg ~ group * time + baseline
sigma ~ group * time + baseline

where sigma in the formula above represents the log-scale standard deviation of the residuals.
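
For reference, the generic brms way to express this kind of distributional regression looks roughly like the sketch below (hypothetical data and variable names, not brms.mmrm internals):

library(brms)
f <- bf(
  chg ~ group * time + baseline,    # mean structure
  sigma ~ group * time + baseline   # log-scale residual standard deviation
)
# fit <- brm(f, data = dat, family = gaussian())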

None of this affects the correlation matrix, however.