Open hhp94 opened 3 weeks ago
Thanks for the report. This is one of those unintentional things; I didn't know that `ocat()` had an `$rd` component. These non-standard families need specialist support and that's something I hadn't gotten to yet (my recollection was that none of them had `$rd` in the family).
This is a bug (I should probably check whether a particular family is supported and throw an error for those that aren't). And yes, the fix will be to pass in a linear predictor value for each observation we're simulating for. I don't think the fix is that trivial, however; I will need to handle all the other non-standard families, and some of those will require different handling to that needed for `ocat()`. I have begun to tackle this in `fitted_values()`, where I have a list of standard families that just work normally and then other families that need special handling through a pre- or post-processing step. This is how I'm handling `ocat()` in `fitted_values()`, for example. I'll need to think about how best to do this in `simulate()` and also try not to duplicate code, as the details of all these parameterisations for all the families make my head hurt from time to time.
Oh, I’d be happy to help! To be honest, I'm still getting familiar with posterior simulations on a theoretical level, so working with your implementations in gratia is a great opportunity for me to learn. Nevertheless, I'll go through your vignette more carefully, review the `fitted_values()` code, and see what might be the best approach. In the meantime, if you have an idea and would like help implementing it, just let me know!
In simulate-methods.R in gratia, for `ocat` models, `predict(type = "response")` currently returns the probability of each category. However, the `fix.family.rd` function expects a vector of linear predictor values instead. This discrepancy leads to incorrect handling in the `simulate.gam` output for `ocat`.

I've pasted the `mgcv::fix.family.rd` function for the `ocat` family here for reference:

Reproducible Example
In this example, `sim` is a $4 \times 1$ matrix because `fix.family.rd` interpreted the output of `predict(type = "response")`, which is `Pr(y = 1, 2, 3, 4)`, as linear predictor values, leading to unexpected results. I'm fairly sure this is a bug. If you agree, let me know how you'd like to fix it and I can open a PR. Should `predict(type = "link")` be used in `simulate.gam` instead?

Thank you so much for teaching me about `gam` and `{mgcv}`.
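
To make the mismatch concrete, here is a minimal base R sketch of why link-scale values are needed. It is not the mgcv or gratia code: `simulate_ocat_sketch()`, `eta`, and `theta` are hypothetical names used only for illustration. It assumes the `ocat` parameterisation as I understand it from the mgcv documentation: a latent logistic variable centred on the linear predictor, with the first cut point fixed at -1 and later cut points spaced by `exp(theta)`.

```r
# Hypothetical helper (not part of mgcv or gratia): draw one ordered
# categorical response per linear predictor value.
simulate_ocat_sketch <- function(eta, theta) {
  # Cut points: the first is fixed at -1, the rest add cumsum(exp(theta)).
  cuts <- c(-Inf, -1, -1 + cumsum(exp(theta)), Inf)
  # Latent variable = linear predictor + standard logistic noise.
  latent <- eta + rlogis(length(eta))
  # Category i corresponds to the interval (cuts[i], cuts[i + 1]].
  findInterval(latent, cuts, left.open = TRUE)
}

set.seed(1)
eta <- c(-3, 0, 2, 5)   # assumed link-scale predictions, one per observation
theta <- c(0, 0.5)      # assumed threshold parameters, giving 4 categories
y <- simulate_ocat_sketch(eta, theta)
length(y)               # one simulated category per observation
```

The point is that the input is one linear predictor value per observation, so the output has one simulated category per observation. Feeding in the four category probabilities instead would treat each probability as if it were a linear predictor value.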