florianhartig / DHARMa

Diagnostics for HierArchical Regession Models
http://florianhartig.github.io/DHARMa/
200 stars 21 forks source link

Diagnostic plots for model using glmmTMB and Beta distibution #405

Open stephrivest opened 3 months ago

stephrivest commented 3 months ago

Hi, I have fitted a glmm (glmmTMB) on the proportional cover of plant species as a function of the amount of urban land (continuous; 0 to 1) and plant lifespan (categorical; 2 levels). I'm mainly interested in an interaction effect between urban land and lifespan. My response is a continuous proportion (0 to 1) so I chose the beta family. I have 214,000 observations. My response includes many 0's (93% of obs are zero) and a few 1's, so I applied a transformation (Smithson & Verkuilen 2006) prior to fitting the model. I added three random effects representing the quadrat ID (1m x 1m areas where plants were sampled), the site ID (the entire habitat), and the species ID. Quadrats were nested within sites, but were coded uniquely.

Here is my model: m1<-glmmTMB(PropCover_transformed ~ Urban + Lifespan + Urban*Lifespan + (1|speciesID) + (1|siteID) + (1|quadratID), family=beta_family(link="logit"), data=my.data)

I used DHARMa to check whether my model was misspecified, but I'm having trouble interpreting the results. I don't think I need to be worried about the significant dispersion test since I think the beta already includes variable dispersion? I'm more concerned about the results of the K.S. test, but do not know how to approach finding a solution. I am also unsure how to interpret residuals vs. predicted plot which just looks like a homogenous grey cloud.

Residual plot from DHARMa: Jeremie_beta glmm1 resid plot

Does the beta distribution simply not fit very well to my data (perhaps because of the quantity of zeros)? Is there anything I can do to improve the specification of this model? Would it be better to consider applying a transformation and fitting an LMM instead?

Any advice would be appreciated! Thanks

florianhartig commented 3 months ago

Hello,

yes, the distribution looks completely off. Given the large magnitude of the pattern, I would say

You can use the function testZeroInflation for Zeros and testGeneric for ones, see examples in help ?testGeneric for an example to test for one-inflation.

Best Florian

florianhartig commented 3 months ago

p.s.: ich nothing comes out of that, I'm happy to have a look at the data if you sent it to me either here or privately via email (please as a minimal reproducible example, see https://github.com/florianhartig/DHARMa/wiki/Asking-for-help)

stephrivest commented 3 months ago

Hello and thank you for your reply! I'd be happy to send you the data privately as the file is too large to attach here. Because I used the Smithson and Verkuilen transformation, I should no longer have any zeros or ones. However, prior to applying the transformation, I had 93% zeros in my dataset (very right-skewed distribution).

Here is the code I used to fit my model...

#applied transformation where n=214,000 is the sample size
BergFinal$PropCover_transf<-((BergFinal$PropCover*213999)+0.5)/214000

#fit the model
m1<-glmmTMB(PropCover_transf ~ PropUrban500 + LifespanAnnualBiennial + PropUrban500*LifespanAnnualBiennial + (1|SpeciesName_WFOaccepted) + (1|siteID) + (1|quadratID), family=beta_family(link="logit"), data=BergFinal)

#simulated residuals and plotted them
res1 <- simulateResiduals(m1)
plot(res1, rank = T)
florianhartig commented 3 months ago

Hi,

thanks, I received your data, but I basically don't have to look at it because the case is quite clear - if you have 93% of values with zeros in your data and then shift this by

((BergFinal$PropCover*213999)+0.5)/214000

essentially to a very small value, you have a massive peak at this small value, likely doesn't fit to the spread of the beta for the other points. These kind of transformations don't really work if you have strong 0/1 inflation (at least if you want that your residuals spread according to a beta).

Instead, try your luck with the zero-inflated Beta in glmmTMB.

Cheers Florian

stephrivest commented 3 months ago

Hi Florian, I thought about using the zero-inflated Beta in glmmTMB, but I also have a few 1's in the dataset (only n=8). So when I ran the zero-inflated model, I got the error: "Error in eval(family$initialize) : y values must be 0 <= y < 1". In this case, would you recommend applying a transformation to adjust the 1's, then running the zero-inflated model? If so, what kind of transformation would you recommend that would apply to the 1's, but not the 0's? Thanks!

florianhartig commented 3 months ago

Hi Stephanie,

regarding glmmTMB questions, you have to ask the glmmTMB crew. In principle it's possible to have zero and one inflation, and I think brms has this implemented, but not sure if it can be done in glmmTMB. As a next best solution, a transformation is certainly possible.

Cheers Florian