gfalbery / ggregplot

Misc plotting and model summary functions
5 stars 5 forks source link

using BYM2 or spatial random effects with INLAModelSel function #2

Open jdsimkin04 opened 3 years ago

jdsimkin04 commented 3 years ago

Hey! Really love this package and the tutorials you have up on ourcodingclub. I am working through a model selection exercise and I'm specifically using the BYM2 model with R-INLA. I came across your package/tutorial on ourcodingclub and saw you can add the response, covariates and random effects using INLAModelSel function but it doesn't work with the BYM2 model (which has a spatial random effect and an unstructured random effect "iid"). I think this is because when you add the spatial random effect to the INLAModelSel, there isn't an argument to add a graph/spatial weights matrix.

This is what the INLA model would look like:

formula <- as.formula(paste0(resp, " ~ ", paste(covar, collapse = " + "), " + f(idarea, model = 'bym2', graph = g, constr = T, hyper = prior, scale.model = T)"))

inla(formula, family = "poisson", data = chsadf_inla, E = exp, control.compute = list(dic = TRUE))

Then model selection ModelSel <- INLAModelSel(resp, covar, "idarea", "bym2", "poisson", mydata)

Output: Error in INLA::f(idarea, model = "bym2") : The 'graph' has to be provided for model bym2

I think this should be relatively easy to implement? i.e. adding a new argument for the INLA argument 'graph'? Another item would be an offset argument for those using Poisson..

Happy to provide data to work with if necessary!

Thanks!

gfalbery commented 3 years ago

Hi Jonathan, thanks for the kind message! It's really nice to know ggregplot is useful, although it's also a bit daunting because I really made it just for my own reckless purposes. Happy to help sort out a solution for this if it's helping you. Hit me with some data if you like, or let me know if you're getting errors with the pushes I make now.

The reason INLAModelSel doesn't really have any allowance for complex models is because I initially designed it to perform simple model selection on a pre-spatial model, and then to add more complex (spatial) effects in subsequent models. I later ended up trying to choose between various spatial components, which I have mostly been doing with INLAModelAdd instead because it makes a bit more sense as a forward model selection/model addition process. This is all a long way to say that INLAModelAdd is probably a better (or at least more updated) bet if you're looking to do some model selection with INLA. Among other things, it's got an allowance to fit and pick between multiple random effects, which is obviously missing from INLAModelSel.

What might work for you is if I add a ... formal argument to the INLAModelSel function and push it (and I'll mirror any changes with INLAModelAdd too). That way you could put e.g. graph = g, constr = T in the INLAModelSel call and then graph = graph, constr = constr in the variable name for the bym2 effect. I'll try pushing that now, but because INLAModelSel's development has lagged behind my coding ability a bit I'm not sure how well it'll handle the complexity of the string.

Another possibility is as you suggest I could just add a series of options for priors, hyperparameters, etc, just like I've done for the random models -- if the ... option doesn't work I'll play around with some other approaches. I could also combine all the random effects etc into a named list where each component of the list has a load of named components that correspond to the arguments that you stick inside the f(), which might be a more sustainable option for fitting quite complex stuff.

G

On Sat, Apr 10, 2021 at 6:37 PM Jonathan Simkin @.***> wrote:

Hey! Really love this package and the tutorials you have up on ourcodingclub. I am working through a model selection exercise and I'm specifically using the BYM2 model with R-INLA. I came across your package/tutorial on ourcodingclub and saw you can add the response, covariates and random effects using INLAModelSel function but it doesn't work with the BYM2 model (which has a spatial random effect and an unstructured random effect "iid"). I think this is because when you add the spatial random effect to the INLAModelSel, there isn't an argument to add a graph/spatial weights matrix.

This is what the INLA model would look like:

formula <- as.formula(paste0(resp, " ~ ", paste(covar, collapse = " + "), " + f(idarea, model = 'bym2', graph = g, constr = T, hyper = prior, scale.model = T)"))

inla(formula, family = "poisson", data = chsadf_inla, E = exp, control.compute = list(dic = TRUE))

Then model selection ModelSel <- INLAModelSel(resp, covar, "idarea", "bym2", "poisson", mydata)

Output: Error in INLA::f(idarea, model = "bym2") : The 'graph' has to be provided for model bym2

I think this should be relatively easy to implement? i.e. adding a new argument for the INLA argument 'graph'? Another item would be an offset argument for those using Poisson..

Happy to provide data to work with if necessary!

Thanks!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/gfalbery/ggregplot/issues/2, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADQWUG2WXEPMO24JYIE5URDTICEDZANCNFSM42WXEVEA .

-- Greg Albery Postdoctoral Fellow, Bansal Lab Georgetown University Washington, DC (Working remotely from London, UK)

@.*** // gregalbery.me google scholar https://scholar.google.co.uk/citations?user=0cLDqU8AAAAJ&hl=en || github https://github.com/gfalbery || twitter https://twitter.com/gfalbery || researchgate https://www.researchgate.net/profile/Gregory_Albery

(I work at all hours but I never expect replies outside 9-5 weekdays)

jdsimkin04 commented 3 years ago

Awesome! Thanks for the help, I really appreciate it! I'm recklessly working through model selection as well and when I came across your coding club rmd, I was replicating a lot of that work manually. My PhD colleague will be running through a similar exercise soon so I'd love to show him your package (we both use BYM2)... maybe you can get some more mileage out of your package!

I didn't know about the INLAModelAdd but that would be of interest. I'm working backwards manually now but can go forwards as well.

I'm not entirely sure how to build in the arguments in the INLAModelSel and Add... if I add some data here, do you mind giving it a try and/or showing me how to specify the arguments in both INLAModel Sel and Add? I tried a few different things but ran into this error a few times when I ran INLAModelSel..

Error in INLAModelSel("cases", c("x1", "x2", "x3", : object 'ScaleVariables' not found

Ok so I've added a few files w/ the data, hopefully this works for you!

mydata <- read.csv("example_data_ggregplot.csv") g <- inla.read.graph(filename = "ch5_chsa_weight.adj")

my prior prior <- list( prec = list( prior = "pc.prec", param = c(0.2 / 0.31, 0.01), initial = 5), phi = list( prior = "pc", param = c(0.5, 2 / 3), initial = -3) )

Here's my formula formula <- response ~ x1 + x2 + x3 + x4 + x5 + x6 + f(idarea, model = "bym2", graph = g, constr = T, hyper = prior, scale.model = T)

My INLA model model <- inla(formula, family = "poisson", data = mydata, E = offset, #offset, control.predictor = list(compute = TRUE), control.compute = list(dic=TRUE, mlik=TRUE,cpo=TRUE, config = T, waic = T), control.inla = list(strategy = "laplace", npoints = 21))

The extra control options I've added are a bit much, I've taken them from the INLA FAQ but perhaps w/ the '...' I could add them in? I'm actually not sure how "..." works. The very least, having "bym2", adding in the offset and graph and constr (and I guess scale.model?) would be huge help.

Thanks!!

J example_data_ggregplot.csv ch5_chsa_weight.adj.zip