inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/

multicolumn covariate matrix support (was: inla.stack handling subtly ignores special features) #35

Closed finnlindgren closed 1 year ago

finnlindgren commented 6 years ago

The inla.stack.mjoin function and its relatives ignore subtle internal structures of the INLA::inla.stack functions, such as keeping track of what is a matrix (or not) in the input. Related to #33 and #34.

Some of those features are currently (v2.1.4) likely not accessible via the inlabru model interface, but robustifying the inlabru internals will allow those features later. Having a collection of covariates stored as a multicolumn matrix that can be used directly in a formula instead of specifying each individual covariate is the main such feature.

The main changes are to use inla.stack.LHS() and inla.stack.RHS() instead of directly accessing and/or changing the internal data structure.

The documentation for inla.stack.mdata claims that INLA::inla.stack.data does not produce a multiple-column observation matrix, but this is incorrect. The actual problem is that the internal stack data structure is accessed in several places without taking the internal matrix information into account, so that information gets lost somewhere along the way, before inla.stack.mdata is called. The supported way of accessing the stack contents is through inla.stack.LHS() and inla.stack.RHS() (inla.stack.data() is simply the union of those two, plus additional user-supplied objects), since these reconstruct any matrices that were used as inla.stack input.

inla.stack.e is not needed at all (relates to #34), since E should be a vector, not a matrix. As far as I can tell, the separately constructed matrix-e is never accessed: in bru.inference.R, the code is E = INLA::inla.stack.data(stk)$e, which reads the regular e information in the stack, whereas inla.stack.mjoin stores the matrix version in a special place that nothing reads. (The special "y", by contrast, is accessed, by inla.stack.mdata.)

inla.stack.y should expand a "y" vector to multicolumn data within each stack. After that inla.stack.join will create the correct internal information. By solving #33, there will no longer be any need to store the "y" and "e" separately, as the regular stack structure will have the required information.
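
As a rough illustration of the "expand a y vector to multicolumn data" step, here is a minimal base-R sketch. The helper name expand_y and its arguments are hypothetical (they are not part of INLA or inlabru), and rbind() only stands in for the row-wise combination that inla.stack.join would perform on real stacks. The idea is that each likelihood's observations occupy their own column, with NA everywhere else.

```r
# Hypothetical helper, not part of INLA or inlabru: place a response vector
# in one column of an NA-padded matrix, one column per likelihood.
expand_y <- function(y, column, n_columns) {
  stopifnot(column >= 1, column <= n_columns)
  out <- matrix(NA_real_, nrow = length(y), ncol = n_columns)
  out[, column] <- y
  out
}

# Two likelihoods with different numbers of observations.
y1 <- c(1.5, 2.0, 0.3)
y2 <- c(4, 7)

# rbind() stands in for the row-wise join of the per-likelihood stacks.
y_joint <- rbind(expand_y(y1, 1, 2), expand_y(y2, 2, 2))
print(y_joint)
```

If every stack carries its "y" in this expanded form from the start, the joined stack already has the multicolumn response in the regular place, and no separate copy of "y" or "e" needs to be kept.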

finnlindgren commented 6 years ago

Solving #33 would also mean that there would be no need for the extra mdata$y.inla step in inla.stack.mdata, as exactly that information would already be available under a special inlabru name. (In fact, inla.stack.mdata could just be an alias for INLA::inla.stack.data.)

finnlindgren commented 6 years ago

All of the above has been fixed in the development version, except for multicolumn matrix covariate collection support.

finnlindgren commented 3 years ago

This could easily be handled via a mapper class and a special model name for "iid with precision set to the fixed effect precision" (also needed for factors), or by having "linear" detect matrix input and switch to fixed-precision iid.

finnlindgren commented 1 year ago

The special model="fixed" feature solves this.