akhileshtayade opened this issue 8 months ago
The distribution of your predictor variables isn't (or shouldn't be) relevant.
As overkill, you could consider a model with `response ~ poly(X, Z, W, degree = 2) + (1 | group)`; this will estimate the full second-order polynomial model with respect to all predictors (linear terms in X, Z, W, quadratic terms in X, Z, W, and all pairwise interactions).
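For concreteness, the suggestion above might be written as follows. This is only a sketch: the data frame `dat` and the variables `response`, `X`, `Z`, `W`, and `group` are simulated stand-ins, not the poster's actual data.

```r
library(glmmTMB)

## Simulated placeholder data: 60 groups, 2 observations each.
set.seed(1)
dat <- data.frame(
  group = factor(rep(1:60, each = 2)),
  X = rexp(120), Z = rexp(120), W = runif(120)
)
dat$response <- plogis(rnorm(120, dat$X - dat$Z))  # fake (0, 1) response

## poly(X, Z, W, degree = 2) expands to all linear and quadratic terms
## in X, Z, W plus all pairwise interactions (orthogonal polynomials).
fit.poly <- glmmTMB(
  response ~ poly(X, Z, W, degree = 2) + (1 | group),
  family = beta_family(),
  data = dat
)
summary(fit.poly)
```

With three predictors and `degree = 2`, `poly()` contributes nine columns (three linear, three quadratic, three cross-products), so this model spends fixed-effect parameters quickly; with only 120 observations that is part of why it is "overkill".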
Adding a square-root term to fix the model diagnostics is not crazy, but seems like it would make the model very hard to interpret.
Since W varies within group, you could try using `(1 + W | group)` rather than `(1 | group)`.
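A sketch of what that random-slope fit might look like in `glmmTMB` (again with simulated stand-in data; the variable names mirror the poster's description, where `X` and `Z` are constant within group and only `W` varies):

```r
library(glmmTMB)

set.seed(1)
dat <- data.frame(
  group = factor(rep(1:60, each = 2)),
  X = rep(rexp(60), each = 2),   # constant within group
  Z = rep(rexp(60), each = 2),   # constant within group
  W = runif(120)                 # varies within group
)
dat$response <- plogis(rnorm(120, dat$X + dat$W))

## (1 + W | group): each group gets its own intercept and its own
## slope for W, with an estimated intercept-slope correlation.
fit.slope <- glmmTMB(
  response ~ X + Z + W + (1 + W | group),
  family = beta_family(),
  data = dat
)
```

With only two observations per group, the random-slope variance and correlation can be weakly identified; if the fit is singular or fails to converge, a diagonal covariance structure, `diag(1 + W | group)`, drops the intercept-slope correlation and is cheaper to estimate.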
You can definitely do what Ben suggests, but personally, I would be satisfied with your second model. It seems a clear improvement over the first model, and while there is still a bit of misfit (mainly in the dispersion / distribution with respect to Z), this seems rather minor to me, and I don't imagine it would have strong effects on any of the relevant model outputs. See the comments in the vignette, section "General remarks on interpreting residual patterns and tests": https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html#general-remarks-on-interperting-residual-patterns-and-tests
A residual pattern does not indicate that the model is unusable: Conversely, while a significant pattern in the residuals indicates with good reliability that the observed data likely did not originate from the fitted model, this doesn't necessarily imply that the inferential results from this wrong model are not usable. There are many situations in statistics where it is common practice to work with "wrong models". For example, many statistical models use shrinkage estimators, which purposefully bias parameter estimates towards certain values. Random effects are a special case of this. If DHARMa residuals are calculated for these estimators, they will often show a slight pattern in the residuals even if the model is correctly specified, and tests for this can get significant for large sample sizes. For this reason, DHARMa excludes RE estimates in the predictions when plotting res ~ pred. Another example is data that is missing at random (MAR). Because it is known that this phenomenon does not create a bias in the fixed-effect estimates, it is common practice to fit this data with standard mixed models. Nevertheless, DHARMa recognizes that the observed data looks different from what would be expected under the model assumptions, and flags the model as problematic (see here).
Once a residual effect is statistically significant, look at the magnitude to decide whether there is a problem: finally, it is crucial to note that significance is NOT a measure of the strength of the residual pattern; it is a measure of the signal-to-noise ratio, i.e. whether you are sure there is a pattern at all. Significance in hypothesis tests depends on at least two ingredients: the strength of the signal and the number of data points. If you have a lot of data points, residual diagnostics will nearly inevitably become significant, because having a perfectly fitting model is very unlikely. That, however, doesn't necessarily mean that you need to change your model. The p-values confirm that there is a deviation from your null hypothesis. It is, however, at your discretion to decide whether this deviation is worth worrying about. For example, if you see a dispersion parameter of 1.01, I would not worry, even if the dispersion test is significant. A significant value of 5, however, is clearly a reason to move to a model that accounts for overdispersion.
Important conclusion: DHARMa only flags a difference between the observed and expected data - the user has to decide whether this difference is actually a problem for the analysis!
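The magnitude-versus-significance point can be illustrated directly with DHARMa (a toy Poisson example, not the poster's Beta GLMM): with enough data even a modest deviation becomes highly significant, so the test statistic, not the p-value, should guide the modelling decision.

```r
library(DHARMa)

set.seed(1)
## Overdispersed counts deliberately fitted with a plain Poisson GLM.
n <- 1000
x <- runif(n)
y <- rnbinom(n, mu = exp(1 + x), size = 2)  # true overdispersion
fit <- glm(y ~ x, family = poisson)

res  <- simulateResiduals(fit, n = 1000)
disp <- testDispersion(res)
disp$statistic  # magnitude of over/underdispersion
disp$p.value    # significance -- driven by n as well as by magnitude
```

Here the dispersion statistic is well above 1, so a model change (e.g. negative binomial) is warranted; a statistic of, say, 1.01 with the same tiny p-value would not be.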
Hi Florian,
I have been trying to understand how to handle residuals that are non-uniform in the y-direction against a predictor when fitting models with a low sample size.
I have fitted a Beta GLMM using `glmmTMB` to data consisting of 120 observations from 60 groups; each group contributes two observations. The model formula can be described as follows:

I then simulated `fit.beta` unconditionally for 1000 iterations (simulation output represented below as `simulationOutput`) and plotted residuals against fitted values as well as against each predictor. The residuals-vs-predictor plots showed no evidence of non-uniformity in the y-direction (figures not shown) except for the predictor `X`:

Figure 1: (From left to right) The first two plots are the result of `plot(simulationOutput)`. The third is generated by plotting residuals against (rank-transformed) `X`, and the fourth by plotting residuals directly against `X`.

From the last plot, I inferred two things:
1. `X` has a right-skewed distribution (which also became evident from the histogram of `X`).
2. A non-linear term in `X` might help the model fit.

Therefore, I fitted another model, `fit.beta.sqr`, to the data and simulated it unconditionally for 1000 iterations (simulation output represented below as `simulationOutput.sqr`):

Figure 2: (From left to right) Plots generated using `simulationOutput.sqr`. The second and the fourth plot display residuals against untransformed `X` and `X^2`, respectively.

Adding the non-linear term (square root of `X`) seemed to have resolved the issue at first glance; however, when I rechecked the residuals against the remaining predictors, the residuals against `Z` appeared non-uniform in the y-direction:

Figure 3: (From left to right) The first two plots are generated using `simulationOutput` and the remaining two using `simulationOutput.sqr`. The second and the fourth plot display residuals against untransformed `Z` from `simulationOutput` and `simulationOutput.sqr`, respectively.

The model `fit.beta.sqr` also seemed to have been correctly specified:

Figure 4: Plot generated using `plot(simulationOutput.sqr)`.
And both simulation outputs show no pattern when grouped according to `group`:

Figure 5: Grouped residuals from `simulationOutput` (left) and `simulationOutput.sqr` (right). Number of groups = 60.

Finally, both `X` and `Z` have right-skewed distributions:

Figure 6: Histograms of `X` and `Z`. Since only `W` changes across the two observations of each group, the histograms use the repeated values of `X` and `Z` from the data, i.e. $X = \{ x^1_1, x^1_2, x^2_1, x^2_2, \dots, x^i_1, x^i_2, \dots, x^{60}_1, x^{60}_2 \}, i \in \{ 1, \dots, 60 \}$.

Given this information, could you please make any suggestions as to what model improvements I should consider?
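For reference, the two fits and the DHARMa workflow described above can be sketched roughly as follows. The actual model formula was shown as an image, so the fixed-effect structure here is an assumption, and the data are simulated stand-ins for `response`, `X`, `Z`, `W`, and `group`:

```r
library(glmmTMB)
library(DHARMa)

set.seed(1)
dat <- data.frame(
  group = factor(rep(1:60, each = 2)),
  X = rep(rexp(60), each = 2),   # constant within group
  Z = rep(rexp(60), each = 2),   # constant within group
  W = runif(120)                 # varies within group
)
dat$response <- plogis(rnorm(120, sqrt(dat$X) - dat$Z))

## First model (assumed form): linear predictors plus random intercept.
fit.beta <- glmmTMB(response ~ X + Z + W + (1 | group),
                    family = beta_family(), data = dat)

## Second model: add the square-root term for X.
fit.beta.sqr <- glmmTMB(response ~ X + sqrt(X) + Z + W + (1 | group),
                        family = beta_family(), data = dat)

## Unconditional simulation (1000 iterations) and residual checks.
simulationOutput.sqr <- simulateResiduals(fit.beta.sqr, n = 1000)
plot(simulationOutput.sqr)
plotResiduals(simulationOutput.sqr, form = dat$Z)  # residuals vs. Z
```

`plotResiduals(..., form = )` is the DHARMa way to produce the residuals-vs-predictor panels referred to in Figures 1-3.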