Diagnostic plots for model using glmmTMB and Beta distibution

stephrivest commented 3 months ago

Hi, I have fitted a glmm (glmmTMB) on the proportional cover of plant species as a function of the amount of urban land (continuous; 0 to 1) and plant lifespan (categorical; 2 levels). I'm mainly interested in an interaction effect between urban land and lifespan. My response is a continuous proportion (0 to 1) so I chose the beta family. I have 214,000 observations. My response includes many 0's (93% of obs are zero) and a few 1's, so I applied a transformation (Smithson & Verkuilen 2006) prior to fitting the model. I added three random effects representing the quadrat ID (1m x 1m areas where plants were sampled), the site ID (the entire habitat), and the species ID. Quadrats were nested within sites, but were coded uniquely.

Here is my model: m1<-glmmTMB(PropCover_transformed ~ Urban + Lifespan + Urban*Lifespan + (1|speciesID) + (1|siteID) + (1|quadratID), family=beta_family(link="logit"), data=my.data)

I used DHARMa to check whether my model was misspecified, but I'm having trouble interpreting the results. I don't think I need to be worried about the significant dispersion test since I think the beta already includes variable dispersion? I'm more concerned about the results of the K.S. test, but do not know how to approach finding a solution. I am also unsure how to interpret residuals vs. predicted plot which just looks like a homogenous grey cloud.

Residual plot from DHARMa: Jeremie_beta glmm1 resid plot

Does the beta distribution simply not fit very well to my data (perhaps because of the quantity of zeros)? Is there anything I can do to improve the specification of this model? Would it be better to consider applying a transformation and fitting an LMM instead?

Any advice would be appreciated! Thanks

florianhartig commented 3 months ago

Hello,

yes, the distribution looks completely off. Given the large magnitude of the pattern, I would say

either something is going wrong here on the technical side,
or your data doesn't fit to a beta at all. Possible reason could be a large zero or one inflation, which is common in the beta.

You can use the function testZeroInflation for Zeros and testGeneric for ones, see examples in help ?testGeneric for an example to test for one-inflation.

Best Florian

florianhartig commented 3 months ago

p.s.: ich nothing comes out of that, I'm happy to have a look at the data if you sent it to me either here or privately via email (please as a minimal reproducible example, see https://github.com/florianhartig/DHARMa/wiki/Asking-for-help)

stephrivest commented 3 months ago

Hello and thank you for your reply! I'd be happy to send you the data privately as the file is too large to attach here. Because I used the Smithson and Verkuilen transformation, I should no longer have any zeros or ones. However, prior to applying the transformation, I had 93% zeros in my dataset (very right-skewed distribution).

Here is the code I used to fit my model...

#applied transformation where n=214,000 is the sample size
BergFinal$PropCover_transf<-((BergFinal$PropCover*213999)+0.5)/214000

#fit the model
m1<-glmmTMB(PropCover_transf ~ PropUrban500 + LifespanAnnualBiennial + PropUrban500*LifespanAnnualBiennial + (1|SpeciesName_WFOaccepted) + (1|siteID) + (1|quadratID), family=beta_family(link="logit"), data=BergFinal)

#simulated residuals and plotted them
res1 <- simulateResiduals(m1)
plot(res1, rank = T)

florianhartig commented 3 months ago

Hi,

thanks, I received your data, but I basically don't have to look at it because the case is quite clear - if you have 93% of values with zeros in your data and then shift this by

((BergFinal$PropCover*213999)+0.5)/214000

essentially to a very small value, you have a massive peak at this small value, likely doesn't fit to the spread of the beta for the other points. These kind of transformations don't really work if you have strong 0/1 inflation (at least if you want that your residuals spread according to a beta).

Instead, try your luck with the zero-inflated Beta in glmmTMB.

Cheers Florian

stephrivest commented 3 months ago

Hi Florian, I thought about using the zero-inflated Beta in glmmTMB, but I also have a few 1's in the dataset (only n=8). So when I ran the zero-inflated model, I got the error: "Error in eval(family$initialize) : y values must be 0 <= y < 1". In this case, would you recommend applying a transformation to adjust the 1's, then running the zero-inflated model? If so, what kind of transformation would you recommend that would apply to the 1's, but not the 0's? Thanks!

florianhartig commented 3 months ago

Hi Stephanie,

regarding glmmTMB questions, you have to ask the glmmTMB crew. In principle it's possible to have zero and one inflation, and I think brms has this implemented, but not sure if it can be done in glmmTMB. As a next best solution, a transformation is certainly possible.

Cheers Florian

florianhartig / DHARMa

Diagnostic plots for model using glmmTMB and Beta distibution #405