paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.28k stars 184 forks source link

multivariate latent variable model (factor analysis) in brms #1543

Closed pcinereus closed 11 months ago

pcinereus commented 1 year ago

Hi, I have read the excellent Estimating Multivariate Models with brms as well as Latent Variable Modelling in brms and have attempted to use the later to explore model based ordination.
For education and testing purposes, I am looking to use the spider abundance data used in examples in the R packages mvabund and ecoCopula. I have also explored pure STAN options, such as Identifiability of GLLVM (factor analytic model). Whilst the STAN option runs very quickly and yields vaguely sensible results, my attempts at a brms based approach are not working at all.

Following is a very minimal script. Note, for brevity I have omitted prior definitions. I appreciate that more informative priors might well help substantially, however, all the attempts I have made have not made substantial gains. Note also that I have opted for a single chain to assist with issues of identifiability.

Any thoughts would be greatly appreciated.

library(brms)
library(mvabund)
data(spider)
spider.abund <- spider$abund |>
  as.data.frame() |>
  mutate(X1 = as.numeric(NA),
         X2 = as.numeric(NA))
form1 <- bf(Alopacce ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form2 <- bf(Alopcune ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form3 <- bf(Alopfabr ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form4 <- bf(Arctlute ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form5 <- bf(Arctperi ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form6 <- bf(Auloalbi ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form7 <- bf(Pardlugu ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form8 <- bf(Pardmont ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form9 <- bf(Pardnigr ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form10 <- bf(Pardpull ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form11 <- bf(Trocterr ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
form12 <- bf(Zoraspin ~ 0 + mi(X1)+ mi(X2), family = negbinomial())
formX1 <- bf(X1 | mi() ~ 0)
formX2 <- bf(X2 | mi() ~ 0)

fact.form <- form1 + form2 + form3 + form4 + form5 + form6 + form7 + form8 + form9 +
     form10 + form11 + form12 +
     formX1 + formX2 + set_rescor(rescor = FALSE)

mod <- brm(fact.form,
     data =  spider.abund,
     iter =  5000,
     warmup =  1000,
     chains =  1, cores =  1,
     thin =  1,
     control = list(adapt_delta = 0.99, max_treedepth = 20),
     backend =  'cmdstanr')

summary(mod)
paul-buerkner commented 11 months ago

Hey, would you mind asking this question on Stan discourse? Just a note that in brms 3.0 we will be developing a more native (less awkward) approach to latent variable modeling in brms that does not need the missing values interface.