Closed by florianhartig 3 years ago
I suspect that these differences are due to randomization. Note that DHARMa residuals for any integer-valued distribution involve a randomisation procedure to smooth out the discrete nature of the data. The default is the PIT procedure, see help. Thus, in principle, each DHARMa residual calculation would lead to slightly different residuals. Because this repeatedly led to confusion among DHARMa users, I fixed the random seed, so that effectively you will get the same result with each run (you can switch this off, see help).
However, if you change the order in which residuals are calculated, you are tricking this system: the randomization will differ because different random numbers are used for each residual. I think this is what happens here, as from the coordinates it seems that g$plotID has a different ordering than g$group.
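The behaviour described above can be sketched as follows. This is a minimal toy example, not the user's actual model: the Poisson GLM, the data, and the `plotID` variable are all made up for illustration.

```r
library(DHARMa)

# Toy Poisson GLM standing in for the user's model (hypothetical data)
set.seed(1)
dat <- data.frame(x = runif(200), plotID = rep(1:20, each = 10))
dat$y <- rpois(200, exp(0.5 + dat$x))
fit <- glm(y ~ x, data = dat, family = poisson)

# Two calls give the same residuals because DHARMa fixes the seed internally
# (default seed = 123); pass seed = NULL to switch this off.
res1 <- simulateResiduals(fit)
res2 <- simulateResiduals(fit)
identical(res1$scaledResiduals, res2$scaledResiduals)  # should be TRUE

# Aggregating by a grouping variable: a differently ordered grouping variable
# pairs different random numbers with each observation, so grouped tests can
# give different p-values even though the underlying model is identical.
resGrouped <- recalculateResiduals(res1, group = dat$plotID)
testDispersion(resGrouped)
```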
You are correct that the two variables have different ordering. When I sort the plotID by lat/lon first, both versions give the same result.
This suggests that recalculateResiduals() has a random element (not simulateResiduals) -- is that correct?
It is troubling to me that different ordering gives quite different p-values. I am wondering if you recommend using a higher number of simulations in the original simulateResiduals call? Would this reduce the differences I get based on ordering?
Update: I tried upping the number of simulations from the default (250) to 1k and 10k -- but I still get very different results depending on which variable I use to group the plots.
- n = 1000: based on plotID, p = 0.9324149278; based on lat/lon, p = 0.03198392638
- n = 10000: based on plotID, p = 0.5529508359; based on lat/lon, p = 0.9016349029
?simulateResiduals says that for n: "The smaller the number, the higher the stochastic error on the residuals. Also, for very small n, discretization artefacts can influence the tests. Default is 250, which is a relatively safe value. You can consider increasing to 1000 to stabilize the simulated values."
Therefore, I thought I would see a reduction in the difference between the p-values. Let me know what you think.
Also, please let me know if you have advice on getting a result that is less sensitive to ordering.
Both recalculateResiduals and simulateResiduals have 2 random components:

1) the stochastic error from the finite number of simulations n;
2) for integer-valued (discrete) distributions, the randomisation in the PIT procedure.

You can make the stochasticity of 1) arbitrarily small by increasing n in simulateResiduals. For continuous distributions, you will therefore get essentially unique quantile residuals.

For discrete-valued distributions, there is additionally component 2), which is independent of n and basically only depends on the size of the data. The larger the data, the less stochasticity you have.
This is an inherent property of all randomized quantile residuals, Bayesian or frequentist, and not really a problem imo, as Type I error rates are still controlled as long as you don't "play around" with the randomisation. That is, you shouldn't try out different randomisations and pick the one with the lowest / highest p-value, but rather take the first one and just roll with it. Yes, there will be some error in individual cases, but overall this will deliver the right result. It's the same as running an experiment: if you ran it again, p-values would also differ.
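To see the two components separately, here is a small sketch on made-up data (the model and variables are hypothetical, not the user's): increasing n shrinks component 1), but with the internal seed disabled, the PIT randomisation of component 2) still makes repeated runs differ slightly for a discrete response.

```r
library(DHARMa)

# Toy Poisson GLM (hypothetical data, for illustration only)
set.seed(1)
dat <- data.frame(x = runif(200))
dat$y <- rpois(200, exp(0.5 + dat$x))
fit <- glm(y ~ x, data = dat, family = poisson)

# Even with a large n, two runs without the fixed seed differ slightly,
# because the PIT randomisation (component 2) is independent of n and only
# shrinks as the dataset itself gets larger.
rA <- simulateResiduals(fit, n = 1000, seed = NULL)
rB <- simulateResiduals(fit, n = 1000, seed = NULL)
head(cbind(rA$scaledResiduals, rB$scaledResiduals))  # close, but not identical
```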
p.s.: see also https://github.com/florianhartig/DHARMa/issues/38
Question from a DHARMa user via email: