JenniNiku / gllvm

Generalized Linear Latent Variable Models
https://jenniniku.github.io/gllvm/

ordination with predictors #101

Closed by julia5cs 1 year ago

julia5cs commented 1 year ago

Hi,

I’m trying to create an ordination plot with predictor variables using the ordiplot function in gllvm, but every time I run the model and create the corresponding ordiplot it gives me very different results (different length and significance of the arrows). Is that normal? Or how much variation is reasonable/expected? Could there be an issue with the way the model is specified?

I’m a beginner in R and gllvm so I would really appreciate your advice.

Thank you!

My model:

ord.env <- gllvm(zb$y2, zb$x, family = "negative.binomial",
                 studyDesign = sDesign,
                 lv.formula = ~ tmp + sal + oxy + station.depth + sed_softness_index + offset(log(zb$a)),
                 row.eff = ~ (1|Cluster),
                 num.lv.c = 2)

zb$y2 = matrix with abundance values for 36 different species
zb$x = matrix for the 5 environmental covariates (scaled)
sDesign <- data.frame(Cluster=zb$env$Station.cluster)
zb$a = matrix with area values for every site

Ordiplot: ordiplot(ord.env, biplot = TRUE)

BertvanderVeen commented 1 year ago

Hello! Thanks for posting your question. Models in gllvm are generally sensitive to their initial values, so results can differ from run to run. The package has an option that makes this a bit easier for you: n.init. This is also described in Niku et al. 2019, if you are interested. It re-fits the model multiple times and picks the "best" fit. To avoid having to do that every time, my advice is to do it once with a large n.init, save the fitted model as an R object, and load it whenever you want to make plots and such.
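The fit-once-and-save advice could look roughly like the sketch below. It uses only base R (saveRDS and readRDS). The helper name fit_or_load and the file name are hypothetical, and a cheap lm() fit stands in for the slow gllvm() call so the example runs on its own.

```r
# Hypothetical helper: run `fit_fun()` only when no saved fit exists at `path`.
fit_or_load <- function(path, fit_fun) {
  if (file.exists(path)) {
    return(readRDS(path))   # reuse the stored fit, no refitting
  }
  fit <- fit_fun()          # slow step, e.g. a gllvm() call with a large n.init
  saveRDS(fit, path)        # store the fitted object for later sessions
  fit
}

# Stand-in demonstration: lm() replaces gllvm() so this runs anywhere.
path <- file.path(tempdir(), "ord_env.rds")   # hypothetical file name
unlink(path)                                  # start from a clean state

fit1 <- fit_or_load(path, function() lm(dist ~ speed, data = cars))
# Second call finds the file, so the fitting function is never evaluated.
fit2 <- fit_or_load(path, function() stop("should not be refitted"))
```

With the model from this thread, the second argument would be a function wrapping the gllvm() call, and ordiplot(ord.env, biplot = TRUE) would then be run on the loaded object.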

On another note, the offset should not be in the model formula, but in the "offset" argument. So, try the following and see if things improve:

ord.env <- gllvm(zb$y2, zb$x, family = "negative.binomial",
                 studyDesign = sDesign,
                 lv.formula = ~ tmp + sal + oxy + station.depth + sed_softness_index,
                 offset = matrix(log(zb$a), nrow = length(zb$a), ncol = ncol(zb$y2)),
                 row.eff = ~ (1|Cluster),
                 num.lv.c = 2,
                 n.init = 5)
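To see what the matrix(...) expression in the offset argument produces, here is a small base-R sketch. The area values and the number of species are made up for illustration; in the call above they would come from zb$a and ncol(zb$y2).

```r
# Hypothetical sampled areas for 3 sites, and a response matrix with 4 species.
area <- c(0.5, 1, 2)
n_species <- 4

# matrix() fills column by column and recycles its data, so every species
# column receives the same per-site log(area) values.
off <- matrix(log(area), nrow = length(area), ncol = n_species)

dim(off)   # one row per site, one column per species
off        # each column is identical to log(area)
```

The result is a sites-by-species matrix in which each site's log(area) is repeated across all species.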

Good luck! Let me know if you need more help!

julia5cs commented 1 year ago

Hi,

Thanks so much for your fast response! I have tried what you proposed and the model is more stable now. However, I still see some variability when generating the ordination plots with the predictors (but less than before). I have attached below an image with the plots that result from two independent runs of the model with the exact same parameters.

Just to understand it better, I was wondering what might cause this variation. Could the characteristics of my data be contributing to it? For example, small effects of the environmental covariates? Overall, would you recommend increasing n.init further? And would you recommend using the AIC values as the criterion for selecting the best model for the ordination?

[Image: ordination with predictors]

BertvanderVeen commented 1 year ago

Interesting! Could you provide a little more information, e.g. a screenshot of the summary of both models?

julia5cs commented 1 year ago

Yes! These are the summary and the coefficients for the LV predictors for each (the list with the coefficients of the predictors for every species is a bit too long to attach).

Thank you,

[Image: ordination with predictors summary]

BertvanderVeen commented 1 year ago

OK. There is little to be done; sensitivity of the model to the initial values does occur. Either way, you should pick the model with the lowest AIC here! It looks like the answers from either model would be similar, but the estimated latent variables differ a bit. Just to double-check: does zb$x include the variable that is in sDesign by any chance? Because that would definitely cause trouble!
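Picking the lowest-AIC fit from several runs can be sketched as below. This assumes AIC() works on the fitted objects. The helper name pick_lowest_aic is hypothetical, and two base-R glm() fits stand in for repeated gllvm runs so the example is self-contained. The stand-ins differ in their formula only so that their AIC values differ; in this thread the fits would be the same model from different runs.

```r
# Hypothetical helper: return the fit with the lowest AIC from a named list.
pick_lowest_aic <- function(fits) {
  aic <- vapply(fits, AIC, numeric(1))   # one AIC value per fit
  print(round(aic, 1))                   # show the values being compared
  fits[[which.min(aic)]]                 # keep the fit with the smallest AIC
}

# Stand-in fits on a built-in dataset (not gllvm objects).
fits <- list(
  run1 = glm(count ~ 1,     family = poisson, data = InsectSprays),
  run2 = glm(count ~ spray, family = poisson, data = InsectSprays)
)

best <- pick_lowest_aic(fits)
formula(best)
```

The selected object can then be saved and used for plotting, so the comparison only needs to be done once.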

julia5cs commented 1 year ago

Okay. No, the variable in sDesign is not in the matrix with the environmental covariates. Thank you so much for your time, really appreciate your help! Now I understand better how the model works :)