athowes / multi-agyw

Spatio-temporal estimates of HIV risk group proportions for AGYW across 13 priority countries in sub-Saharan Africa
https://athowes.github.io/multi-agyw
MIT License
3 stars 2 forks source link

Add category x space x survey interactions to model comparison #25

Closed athowes closed 2 years ago

athowes commented 3 years ago

Can you use three options with R-INLA's group option somehow to do this? I guess worst case use the generic to define the space x survey interaction and then put an IID group for category on top of it

athowes commented 3 years ago

Think it's going to have to be like:

inla.rgeneric.besagxar1.model <- function(cmd = c("graph", "Q", "mu", "initial","log.norm.const", "log.prior", "quit"), theta = NULL)

for the Besag spatial and AR1 temporal interaction.

The rest of them have IID parts in so can be made without

athowes commented 3 years ago

Jeff's repo here could be helpful https://github.com/jeffeaton/inla-sandbox/blob/master/inla-internals.md

athowes commented 3 years ago

Rather than try to write a manual AR1 x Besag, I think it'll be easier to write a "manual" Besag x IID then use the group option for AR1. With proper scaling, I think that Besag x IID specified via the group option should be equivalent to a Besag which has copies of the adjacency graph like this:

image

This could be tested.

athowes commented 3 years ago

e.g. the adjacency matrix looks like

image

athowes commented 3 years ago

Need to fix this bug:

 Error in inla.interpret.formula(formula, data.same.len = data.same.len,  : 

The covariate `area_idx' are used in the following f()-terms: 4, 6
Only one is allowed since the naming convention of the f()-terms depends on unique names of the covariates

You can resolve this as follows:
Not allowed: formula = y ~ f(x, <etc>) + f(x, <etc>) + <etc>
Fix: data$xx = data$x; formula = y ~ f(x, <etc>) + f(xx, <etc>) + <etc> 

Can be accomplished using area_idx_copy and sur_idx_copy

athowes commented 3 years ago

Crashed at MOZ

Begin fitting Model 8x.

    GMRFLib version 3.0-0-snapshot, has recived error no [12]
    Reason    : The Newton-Raphson optimizer did not converge
    Message   : Condition `lambda < 1.0 / lambda_lim' is not TRUE
    Function  : GMRFLib_init_GMRF_approximation_store__intern
    File      : approx-inference.c
    Line      : 2953
    GitID     : file: approx-inference.c  1f6a39183ef43d8ef33f10ff3f04fd13f8432758 - Mon Feb 22 21:27:50 2021 +0300

Segmentation fault (core dumped)
[ failed     ]  2021-10-02 19:54:29
[ elapsed    ]  Failed report in 25.26072 mins
 Error in inla.inlaprogram.has.crashed() : 
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <help@r-inla.org>. Begin fitting Model 8x.

    GMRFLib version 3.0-0-snapshot, has recived error no [12]
    Reason    : The Newton-Raphson optimizer did not converge
    Message   : Condition `lambda < 1.0 / lambda_lim' is not TRUE
    Function  : GMRFLib_init_GMRF_approximation_store__intern
    File      : approx-inference.c
    Line      : 2953
    GitID     : file: approx-inference.c  1f6a39183ef43d8ef33f10ff3f04fd13f8432758 - Mon Feb 22 21:27:50 2021 +0300

Segmentation fault (core dumped)
[ failed     ]  2021-10-02 19:54:29
[ elapsed    ]  Failed report in 25.26072 mins
 Error in inla.inlaprogram.has.crashed() : 
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <help@r-inla.org>. 
athowes commented 3 years ago

Based on early model comparison plots, looks to me like the interactions are improving on things slightly: probably worth keeping in model comparison

athowes commented 3 years ago
athowes commented 2 years ago

See https://github.com/jeffeaton/inla-sandbox/blob/c77d854f43a160841ae7172698e7453e96f08f78/inla-internals.R#L164 for additional information about the interactions between R-INLA's group and extraconstr / constr options.

Background

Currently, space x survey x category random effects are being implemented via (here is Model 6x which has Besag x IID x IID):

f(area_idx_copy, model = "besag", graph = adjM, scale.model = TRUE, group = sur_cat_idx,
  control.group = list(model = "iid"), constr = TRUE, hyper = tau_pc(x = 0.001, u = 2.5, alpha = 0.01))

The challenge here is how to get the right sum-to-zero constraints.

How to do it with simpler random effects

Using the default option constr = TRUE places a sum-to-zero constraint for each value of group. So for simpler age x category random effects

f(area_idx, model = "iid", group = cat_idx, control.group = list(model = "iid"),
            constr = TRUE, hyper = tau_pc(x = 0.001, u = 2.5, alpha = 0.01))

then we get the correct sum-to-zero within each category:

sum_a \alpha_{ak} = 0 \forall k = 1, ..., K.

What we need for the interactions

What is required for the interaction random effects \delta_{itk} is:

sum_{it} \delta_{itk} = 0 \forall k = 1, ..., K.

What I have currently is:

sum_{i} \delta_{itk} = 0 \forall t, k

If there are K categories and T time points then I am getting K x T sum-to-zero constraints: not only does the sum over it have to be zero for all k = 1, ..., K but so does the sum over each i for all t.

One way to implement this would be to get group = cat_idx rather than group = sur_cat_idx. This would be OK for this particular model but would need a custom model for Besag x AR1.

What's required for each of the interaction models

Model Space x Survey x Category Proposed strategy
5x IID x IID x IID Done
6x Besag x IID x IID Use a repeat_matrix to put Besag x IID in the f and let category be group
8x IID x AR1 x IID Not sure.
9x Besag x AR1 x IID Not sure.
athowes commented 2 years ago

Got this error for 6x:

Error in inla(formula, data = df_model, family = "xPoisson", control.predictor = list(link = 1),  : 
  In f(area_sur_idx): 'covariate' must match 'values',  and both must either be 'numeric', or 'factor'/'character'. 

EDIT: fixed below.

athowes commented 2 years ago

Jeff suggests to use separate f terms for each category and share precision parameters between terms using copy as a potential workaround.

athowes commented 2 years ago

There is a discussion on the R-INLA Google group about this here: https://groups.google.com/g/r-inla-discussion-group/c/ypFiyve-LLY

athowes commented 2 years ago

I wrote copy in the above commit but I mean replicate.

replicate allows you to define random effects u_1, ..., u_K such that u_k ~ f(\theta), where \theta is shared between the random effects. For more information, see this section of Virgilio Gómez-Rubio's Gitbook.

I believe what I have implemented is close to being correct, but not totally. Going to try running through the countries to look at model fit criteria when using this almost correct interaction term!

Resources for rgeneric

Downstream issues to fix

[ failed ] 2021-11-29 18:02:48 [ elapsed ] Failed report in 3.214426 secs Error in max_idx[[i]] : subscript out of bounds In addition: There were 22 warnings (use warnings() to see them)



- [ ] `process_variance-proportions` runs but needs to be updated.
athowes commented 2 years ago
#' BWA errored straight away :'(
#' MOZ crashed on 8x
#' MWI crashed on 5x