Open florianhartig opened 4 years ago
OK, I have looked into this. The point for a DWPR regression is that residuals per data point don't make sense, because the weights are necessarily ignored in the simulations (the weighted Poisson does not have an explicit data-generating function), and thus any attempts to build weights on single 0/1 observations are doomed.
What works perfectly fine, however, is to aggregate 0/1 observations and predictions to a grid, create Poisson simulations by hand from that, and read the results back into DHARMa (see example below). If you do this on real data, note that:
Grid size is arbitrary. I would recommend to choose a size large enough that residuals compute fast, and that not all grid counts are zero.
If your environmental variables are available at a finer resolution, upscale predictions from the model by calculating the predictions for all sub-grid values, and sum / mean that.
# Definining incidence on a grid according to Poisson
gridData = expand.grid(x = 1:50, y = 1:50)
gridData$env = gridData$x/50 * gridData$y/50
gridData$incidence = exp(1 + 0.5 * gridData$env)
gridData$counts = rpois(nrow(gridData ), gridData$incidence)
# Creating point observations
pointData <- gridData[rep(row.names(gridData), gridData$counts), 1:3]
pointData$pres = 1
# Add pseudo-absencens randomly
nPseudo = 10000
pseudo = data.frame(x = sample.int(50, size = nPseudo, replace = T), y = sample.int(50, size = nPseudo, replace = T))
pseudo$env = pseudo$x/50 * pseudo$y/50
pseudo$pres = 0
pointData = rbind(pointData, pseudo)
# creating weights
pointData$weight = rep(1.e-6, nrow(pointData))
pointData$weight[pointData$pres == 0] = nrow(gridData) / sum(pointData$pres == 0)
# fitting the model
fit <- glm(pres/weight ~ env, data = pointData, family = poisson, weights = weight)
summary(fit)
# creating predictions and simulations from the predictions, and read in as DHARMa object
gridData$pred = predict(fit, newdata = gridData, type = "response")
sim = replicate(500, rpois(length(gridData$pred), gridData$pred))
library(DHARMa)
res <- createDHARMa(sim, gridData$counts, fittedPredictedResponse = gridData$pred, integerResponse = T)
# plotting residuals
plot(res)
testSpatialAutocorrelation(res, x = gridData$x, y = gridData$y)
Result:
Question from a user is if DWPR, as described in Renner, Ian W., et al. "Point process models for presence‐only analysis." Methods in Ecology and Evolution 6.4 (2015): 366-379, can be used with DHARMa.
This is an example plot from a DWPR model, fitted with MGCV. The mgcv throws the warning what weights are ignored, which is to be expected. I suspect that the pattern originates from the pseudo-absences, which are set to have low weight during the fit, but in general the model will predict a non-random pattern for those, which will not be matched by the data (which is alway zero)
Idea: calculate DHARMa residuals only for presences?