Open florianhartig opened 3 years ago
Hi Florian,
I was looking at this and the weights are not ignored, but passed to family$rd() or family$qf()... For instance:
> gaulss()$rd
function (mu, wt, scale)
{
    return(rnorm(nrow(mu), mu[, 1], sqrt(scale/wt)/mu[, 2]))
}
uses the weights.
Matteo
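To make the effect of `wt` in that function concrete, here is a minimal sketch in base R. The function body is copied from the console output above so the example runs without loading mgcv; reading the second column of `mu` as an inverse standard deviation is an interpretation of the formula, not something stated in the thread, and the sample size and weights are arbitrary.

```r
# Copy of the simulator quoted above (gaulss()$rd), so mgcv is not needed here.
rd <- function(mu, wt, scale) {
  rnorm(nrow(mu), mu[, 1], sqrt(scale / wt) / mu[, 2])
}

set.seed(1)
n <- 10000
# Column 1: mean. Column 2: divides the sd (assumed to act as 1/sd).
mu <- cbind(rep(0, n), rep(1, n))

y1 <- rd(mu, wt = rep(1, n), scale = 1)  # unit weights
y4 <- rd(mu, wt = rep(4, n), scale = 1)  # weight 4 should halve the sd

# The empirical sd shrinks by about sqrt(4) = 2 under the larger weight.
print(c(sd_wt1 = sd(y1), sd_wt4 = sd(y4)))
```

So for this family the weights do enter the simulation, as a scaling of the variance.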
Yes, for gaussian / binomial, weights have a particular meaning in the likelihood / data-generating model, but for Poisson, the weights are just weights on the likelihood and have no correspondence to any data-generating model (effectively, this is a pseudo-likelihood). In this case, simulated data will not always look like observed data (because the weights cause the fit to disregard particular data points).
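A small sketch of this mismatch, using base R's `glm()` and `simulate()` rather than mgcv or mgcViz (whose behaviour is not tested here). The data, the contamination of the first 20 points, and the weights are made up for illustration. Base R's Poisson family simulates from the fitted means and signals a warning when the prior weights are not all 1, which the handler below collects.

```r
set.seed(42)
n <- 200
x <- runif(n)
y <- rpois(n, exp(1 + x))
y[1:20] <- y[1:20] + 30      # shift the first 20 observations upwards
w <- rep(1, n)
w[1:20] <- 0.01              # likelihood weights: nearly ignore those points

fit <- glm(y ~ x, family = poisson, weights = w)

# Collect any warnings raised while simulating instead of printing them.
msgs <- character(0)
sim <- withCallingHandlers(
  simulate(fit, nsim = 1)[[1]],
  warning = function(cond) {
    msgs <<- c(msgs, conditionMessage(cond))
    invokeRestart("muffleWarning")
  }
)

print(msgs)
# The down-weighted points are far from anything the fitted model simulates.
print(c(observed = mean(y[1:20]), simulated = mean(sim[1:20])))
```

The simulated values for the down-weighted observations follow the fitted Poisson means, so they do not reproduce the shifted observed values.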
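So, effectively, weights in regression packages in R are used in 3 different ways: as the number of trials in a binomial model, as a scaling of the variance in a gaussian model, and as pure weights on the likelihood without any data-generating interpretation.
In retrospect, I think it was a mistake by the R programmers to overload the weights argument in glm with these different meanings; it would have been much better to have separate argument names for all three options.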
Anyway, what I would suggest is to throw a warning for all families that use weights on the likelihood only, without a data-generating model. This is certainly the case for the Poisson; I am not sure about all the other extended families.
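One way such a check could look, as a rough sketch only: `warn_if_likelihood_weights()` is a hypothetical helper, not a function from mgcv, mgcViz, or DHARMa, and the default list of families whose weights have a data-generating meaning (gaussian, binomial) simply follows the discussion above and would need review for the extended families.

```r
# Hypothetical helper: warn when a fitted model has non-unit prior weights
# and its family is not in the list where weights map to a generative model.
warn_if_likelihood_weights <- function(fit,
                                       generative = c("gaussian", "binomial")) {
  w <- weights(fit, type = "prior")
  fam <- family(fit)$family
  ignored <- !is.null(w) && any(w != 1) && !(fam %in% generative)
  if (ignored) {
    warning("Family '", fam, "': weights act on the likelihood only and ",
            "will not be reflected in simulated data.", call. = FALSE)
  }
  invisible(ignored)
}

# Made-up data to exercise the helper.
set.seed(1)
d <- data.frame(x = runif(50))
d$y <- rpois(50, exp(d$x))
d$w <- runif(50, 0.5, 2)

fit_pois <- glm(y ~ x, family = poisson, data = d, weights = w)
fit_gaus <- glm(y ~ x, family = gaussian, data = d, weights = w)

flag_pois <- suppressWarnings(warn_if_likelihood_weights(fit_pois))
flag_gaus <- warn_if_likelihood_weights(fit_gaus)
print(c(poisson = flag_pois, gaussian = flag_gaus))
```

The helper returns its flag invisibly, so a simulation wrapper could call it once before simulating and otherwise proceed unchanged.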
Hi Matteo,
I am considering switching to mgcViz:::simulate for simulating from gam objects in DHARMa, see https://github.com/florianhartig/DHARMa/issues/309.
One suggestion: when fitting models with weights for families other than binomial and gaussian, I assume that the weights are simply applied to the likelihood during fitting, but ignored in the simulations. I think it would be better to throw a warning in that case (currently, no warning is returned).
Cheers, F