Thanks!
A practical consideration is that any vignette/article that uses INLA (directly or indirectly, e.g. via bru()) must be placed in the vignettes/articles folder. That way, it is built for the website by an automated GitHub action that has access to INLA, but the package doesn't include it directly; only via an automated link in the website_examples vignette, and a manually added menu link in _pkgdown.yml.
Sure, I will relocate it. I have just been following your bru_mapper vignette; I thought you moved them after they mature.
The bru_mapper vignette can be an ordinary vignette because it doesn't use the INLA package.
I fixed some of the comments. The units on the prediction plots are all over the place (Poisson is in 1/1000s, ZIP in 1000s, and ZAP in 100s). Clearly something is wrong with the models.
I feel bad you are spending time on this. Perhaps one of your students could have a look. Thank you very much!
The setup code sets eval=FALSE, preventing the code from actually running; it seems to work fine with eval=TRUE though.
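For reference, flipping that switch in the vignette's setup chunk (assuming the usual knitr setup pattern) is a one-liner:

```r
# Enable evaluation of all chunks from the setup chunk:
knitr::opts_chunk$set(eval = TRUE)
```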
Only the final model seems to produce sensible results here; even the second one (that's said to be much better than the first) gets an odd hotspot. This is likely due to the vegetation factor covariate; it's too sensitive to local variation in these models, I think. It would probably be better to exclude that covariate, and perhaps include one of the other continuous covariates instead, e.g. waterdist, heat, or slopeangle.
I like that we have a factor and a continuous covariate here. The model does not need to be perfect, it needs to be illustrative.
Does the centering of the continuous raster matter? Is there a bru_mapper that ensures that the continuous raster covariates fit nicely?
The problem with vegetation is that it has too many, highly localised values, making it sensitive to the behaviour at just a small set of locations. Setting the precision parameter to a larger value might help: hyper=list(prec=list(initial=..., fixed=TRUE)).
For linear models, centering mostly changes the intercept interpretation (and removes collinearity with the intercept) but shouldn't be important. In particular, inlabru cannot and should not do its own centering (since it doesn't know if the covariate will be involved in a non-linear expression). Even for linear models, data should be intentionally centred instead of auto-centred, with the possible exception of systems that can then undo data-dependent centring, and/or keep track of the centring and apply it to newdata in predictions. I sometimes see people calling scale() directly in model formulas, which basically makes the parameter estimates non-interpretable, and automated prediction impossible.
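As a hedged illustration of that last point (data objects and column names here are hypothetical), deliberate centring stores the shift and re-applies it to prediction data:

```r
# Compute the shift once, from the training data only...
elev_mean <- mean(train_data$elevation)
train_data$elev_c <- train_data$elevation - elev_mean
# ...and apply the *same* shift to newdata, so predictions stay consistent:
newdata$elev_c <- newdata$elevation - elev_mean
```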
```
#> Unknown keyword in `hyper' ` prec '. Must be one of theta theta1 prob logit probability
```
The hyper argument goes in the vegetation() component specification, not the observation model.
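Roughly like this sketch, assuming a vegetation component over a raster covariate (the names and the initial value are placeholders, not the vignette's actual code):

```r
# Fix the factor-effect precision inside the component definition, not in the
# observation model; `initial` is on the log-precision scale.
cmp <- ~ vegetation(veg_raster,
  model = "factor_contrast",
  hyper = list(prec = list(initial = 6, fixed = TRUE))
)
```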
Further, on covariate centring & scaling: it is good if the scale of the covariates is around "1", though, as the internal algorithms are sometimes adjusted to be accurate in those cases. But that's best done as a preprocessing step, so one can have full control and knowledge (e.g. converting elevation in metres into kilometres when dealing with temperature data across the globe).
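For example (hypothetical column names):

```r
# Rescale elevation from metres to kilometres so the covariate is O(1):
covars$elev_km <- covars$elevation_m / 1000
```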
This makes a huge difference, I feel. Not sure the value I picked is good. Feels rather arbitrary.
I'm making some changes/corrections to the vignette, to compute correct/comparable posterior predictions, and to fix the ZAP model; it was using the same intercept for both the "present" and "count" parts, for example. I'm also adding diagnostic plots. I also found why the vegetation precision had such a big effect; you had fixed the precision to exp(20), i.e. variance $\approx 0$, which was a bit more extreme than I had in mind.
In the ZAP model, the same components, with the same scaling, were used in both predictors, but since the predictors are on different scales, and are intended to capture different aspects, that's a likely problem. I'm testing a bunch of local changes now that should make things a bit clearer, I hope.
In the ZAP example I showed you, they also use the same components, but explain that the SPDE could be different. Looking forward to your edits!
Even if some covariate might be ok with the same coefficient, the intercept definitely needs to be different.
It turns out one of the issues with the vegetation covariate is that the first level is also the rarest (just 48 boxes have that value), which messes up "factor_contrast", as the differences to the reference level are uncertain. Using "factor_full" with constr = TRUE may be a safer option. Unfortunately, level reordering seems complicated; I couldn't adapt the options given at https://stackoverflow.com/questions/72723613/how-can-i-change-the-ordering-of-levels-for-a-categorical-spatraster-in-terra
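A sketch of that suggestion (the covariate object and component names are hypothetical):

```r
# "factor_full" gives one coefficient per level; constr = TRUE adds a
# sum-to-zero constraint so the intercept remains identifiable.
cmp <- ~ Intercept(1) +
  vegetation(veg_raster, model = "factor_full", constr = TRUE)
```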
In the current version, the ZIP model actually does worse than plain Poisson, but the ZAP model does do better. It's currently plotting and assessing the posterior median instead of the posterior mean, to avoid the artificial hotspot in the posterior mean that I think comes from the rare vegetation type and $E(\exp(X))\approx\exp(E(X)+V(X)/2)$; when the uncertainty is high, the expectation also becomes high. Quantiles are more stable.
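A quick numeric check of that effect, in plain base R:

```r
# For Gaussian X, E[exp(X)] = exp(E(X) + V(X)/2), so high posterior variance
# inflates the mean of exp(X) while leaving the median untouched.
set.seed(1)
x <- rnorm(1e6, mean = 0, sd = 2)
mean(exp(x))    # close to exp(0 + 2^2 / 2) = 7.39
median(exp(x))  # close to exp(0) = 1
```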
Made some small corrections to the text and organized the plots with patchwork. A couple of questions:
The predictors work on different scales, so they need to be scaled differently. Since vegetation was only used in the "present" model, that only leaves the "field" component as a component with more than one parameter, so that one was made into a copy model, so that inla estimates a scaling parameter. (I actually tried that for elevation as well, but I think "linear" models can't be "copy" models; the error message didn't say that directly though, so there may have been another issue.)
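The copy construction might look roughly like this (component and object names are hypothetical; a sketch, not the vignette's actual code):

```r
cmp <- ~ field(geometry, model = matern_spde) +
  # Reuse the latent field in the second predictor; fixed = FALSE tells
  # INLA to estimate a scaling ("copy") parameter instead of fixing it to 1.
  field_scaled(geometry, copy = "field", fixed = FALSE)
```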
I also found a bug in the summary.component() code, making it fail when presenting info about components with "weights". (I experimented with using the component weights argument to scale the elevation covariate effect in km instead of metres; the estimation worked fine, but the summary() call wouldn't work.) I know how to fix it.
The pred_var calculations aren't quite right; they don't fully take into account the zero-inflation part. Similarly with the PIT values; I just removed the zero counts, but that's not quite right either.
The binomial part gets the presence/absence data (presence column) and the ZIP part gets only the non-zero rows (counts column). zeroinflatedpoisson0 is the part treating zeros; I believe it is zero that is a success there, not one, so the probability should be complementary in the prediction.
"Probability of success is observing a zero in this model"
I was referring to the PIT assessment plotting code afterwards.
Check inla.doc() for the model's parameterisation.
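E.g.:

```r
# Open the INLA documentation matching the zero-inflated likelihood models:
INLA::inla.doc("zeroinflated")
```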
Ah, did I just use pnorm() in the predict call? Verify with inla.doc() which quantity should multiply the truncated Poisson part. Also, if I used pnorm, I should have used plogis! And if it's in the wrong direction, I should add lower.tail=FALSE.
Yeah, it is written plogis, but it should be for the complementary tail.
$$ \text{Prob}(y\vert\dots)=p\times 1_{y=0}+(1-p)\times \text{Poisson}(y\vert y>0) $$
where $p=\text{logit}^{-1}(\theta)$
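In R terms (the theta value is just a placeholder for the internal hyperparameter):

```r
theta <- 0.3                            # example hyperparameter value
p <- plogis(theta)                      # logit^{-1}(theta)
q <- plogis(theta, lower.tail = FALSE)  # complementary probability, 1 - p
```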
I have these equations in the vignette for exactly this purpose.
Looks surprisingly good
```r
# Mean Dawid-Sebastiani scores
c(mean(lambda_poi$DS), mean(lambda_zip$DS), mean(lambda_zap$DS)) %>%
  set_names(c("Poisson", "ZIP", "ZAP"))
#>  Poisson      ZIP      ZAP
#> 2053.304 2055.336 1204.949
```
Some very bright spots appear. Need to tweak the prior for the vegetation, as you suggested earlier.
I've been going through the code and updating the score code a bit, and when I added a plot of the probability used in the lambda prediction, it was larger where there are no points. The original code was actually right. The binomial part estimates the "probability of presence", let's call it p. The ZAP part estimates q = "probability of absence". So 1-p=q, and the original lower.tail=TRUE code was actually correct! I'll change it back and do some more diagnostics.
Hmm, I still don't quite understand. It can very well be that we need a probability-of-absence binomial. This is how I read the math: $p\times 1_{y=0}$.
The binomial part of the third model is a presence/absence model, and estimates the presence probability. In the ZAP model parameterisation, the parameter is the "probability of zero", i.e. the opposite, so when the formula says $(1-p)\times\text{Poisson}$, $1-p$ is actually the presence probability. Take a look at the new version, where I've added a plot of the presence probability.
The new version also has sensible score interpretation; the ZAP MAE score is slightly lower than the others, the RMSE is slightly higher, but the DS score is clearly lower, indicating it captures uncertainty/variability better.
Note: I haven't fully corrected the prediction variance calculations, so pred_var might not be fully accurate.
Sorry, I was a bit unclear, as the ZAP parameter is fixed to zero in our case. What I should have said was that the binomial part estimates $p=P(\text{presence})$. In the ZAP part we estimate $\lambda$ for $\text{Poisson}(y\mid y>0)$. That led me to $p\lambda$. But it's a truncated Poisson, so that's not quite right either. It should be $E(\text{count}) = E[E(\text{count}\mid\text{present})\, I(\text{present})] = p\, E(\text{count}\mid \text{count}>0)$, which isn't quite $p\lambda$. I'll work out what that conditional expectation is and update the code.
The correct prediction expectation is $\frac{p \lambda E}{1- \exp(-\lambda E)}$, where E is the effort/exposure/area. [Edited: corrected]
For ZIP:
$$\begin{gathered} E(\text{count})=(1-p)\lambda \\ \operatorname{Var}(\text{count})=(1-p)(\lambda+p\lambda^2) \end{gathered}$$
For ZAP:
$$\begin{gathered} E(\text{count})=\frac{p\lambda}{1-\exp(-\lambda)} \\ \operatorname{Var}(\text{count})=\frac{p(\lambda+\lambda^2)}{1-\exp(-\lambda)}-\left(\frac{p\lambda}{1-\exp(-\lambda)}\right)^2 \end{gathered}$$
And of course $\text{logit}(p)=\alpha_0+\alpha_iX_B$ and $\log(\lambda)=\beta_0+\beta_iX_Z$.
Edit: You are right, should be $\lambda E$ everywhere
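A sketch of those formulas as helper functions (names hypothetical; the exposure E is included per the edit, and the ZAP second moment is $p(\lambda+\lambda^2)/(1-\exp(-\lambda))$):

```r
# ZIP: p = P(structural zero); count is 0 w.p. p, Poisson(lambda*E) w.p. 1-p.
zip_moments <- function(p, lambda, E = 1) {
  mu <- lambda * E
  list(mean = (1 - p) * mu,
       var  = (1 - p) * (mu + p * mu^2))
}

# ZAP: p = P(presence); positive counts follow a zero-truncated Poisson.
zap_moments <- function(p, lambda, E = 1) {
  mu <- lambda * E
  trunc <- 1 - exp(-mu)          # P(Y > 0) for Y ~ Poisson(mu)
  m1 <- p * mu / trunc           # E(count)
  m2 <- p * (mu + mu^2) / trunc  # E(count^2)
  list(mean = m1, var = m2 - m1^2)
}
```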
OK, thanks! I've modified most of the code already to focus on E(count) instead of lambda. Will modify the ZIP part as well and then commit.
This is one of those things you wish you did not have to remember. Wouldn't it be cool if predict was able to correctly identify the expectation and variance for each type of model?
Should we put the expectation/variance math into the vignette? I committed.
Yes, I've done work on this before, and the plan was to have helper functions of some sort. Similar issues arise in all posterior data-level predictions.
I just went along with this ZAP hackery (fixing the parameter), but there's now zeroinflatedcenpoisson0 in INLA. There are no examples or documentation anywhere, but I wonder if that's the model we actually need.
I have my concerns regarding the appropriateness of RMSE and MAE as metrics for counts.
No the "cen" models look like they are just version that allow censored observations like in survival analysis; I don't think that helps us here as we want truncation instead.
I've pushed my changes but I'm sure I made some mistakes. Will be busy until Wednesday.
Thank you! This is a lot for me to chew on!
I'm taking another quick look to see if I can spot the bug that made the ZIP prediction variance invalid.
Found it and am fixing it.
The rare vegetation types are an issue...
The new version has the bugs fixed, and also uses n.samples=2500, to hopefully avoid the Monte Carlo error of the prediction confusing things. Not sure how to interpret the results yet.
I think one should assess predictions of "count>0" and "count|count>0" separately, for all three models. The mixing of zero-effects and other effects in the score calculations confuses things. The ZAP model is clearly doing something better than the others, but the scores aren't clear about it at all.
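For instance, a small helper that scores the presence part and the positive-count part separately (all object names hypothetical):

```r
score_split <- function(obs, p_present, cond_mean) {
  pos <- obs > 0
  list(
    brier    = mean((p_present - pos)^2),             # score for I(y > 0)
    mae_pos  = mean(abs(cond_mean[pos] - obs[pos])),  # scores for y | y > 0
    rmse_pos = sqrt(mean((cond_mean[pos] - obs[pos])^2))
  )
}
```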
I picked out two classes of vegetation that make the most sense. Mean DS is now showing improvement of ZAP over the other models.
I think our spatial field is a bit too focused, so remote nests get picked out by the DS metric (showing as bright pixels). Perhaps we could relax the SPDE? Changing sigma triggers overfitting, but a change in range does not produce any effect. Hmm...
I thought about those individual pixels, but those appear simply because a zero gets one score and a one gets another score, and they occur in a region with generally low probability, so the ones get a high score. One should really only interpret the pairwise score differences (and their spatial averages/sums), and those were more mixed even in the previous version (I haven't checked your new version yet). Also, we're still doing in-sample assessment; out-of-sample would be preferable. Unfortunately, CPO doesn't quite work for comparison here, because the ZAP/Hurdle model has a mix of single observations and pairs of observations that would need to be left out. Simpler would be to do a training/test data split, using some data only for estimation and the rest of the data only for predictions. Also, computing separate scores for prediction of $I(y_i > 0)$ and for $y_i\mid y_i>0$ would help; I think it will then be clear that the ZAP/Hurdle model predicts $I(y_i>0)$ better.
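A minimal sketch of such a split (the data object is hypothetical):

```r
set.seed(42)
n <- nrow(nest_df)
train_idx <- sample.int(n, size = round(0.8 * n))
train <- nest_df[train_idx, ]
test  <- nest_df[-train_idx, ]
# Fit each model with bru() on `train` only, then predict on `test` and
# compute the MAE/RMSE/DS scores there, out-of-sample.
```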
When I run the new model, I still get worse scores for ZAP than for the other two (across all four scores; they're all negatively oriented, and the ZAP scores are the largest, i.e. worst):
| Model   | MAE       | MSE       | MDS       | MLG       |
|---------|-----------|-----------|-----------|-----------|
| Poisson | 0.2711175 | 0.6257786 | -2.191408 | 0.4124088 |
| ZIP     | 0.2918497 | 0.6871863 | -2.162391 | 0.4286283 |
| ZAP     | 0.3046396 | 0.7194702 | -2.157850 | 0.4477888 |
But like I said, these are in-sample scores. Should check out-of-sample scores.
I would like to initiate the vignette on ZIP and ZAP models. These are very useful in my field of work, and I would appreciate learning more about them through the contributions of others.