Open zsusswein opened 4 months ago
In a world where you wanted to generalise this to more complex models then I think brms
support would be good but if we are happy shutting the door on that (without major reworking) i think its fine it make a statement that this is a mgcv extension package and work from there
I'm definitely not opposed in principle, but I think we'd need a clear vision of the scenarios we want to enable and what it means for functionality vis a vis EpiNow2/Epinowcast. If we go full bayes, why not implement it as a forward renewal model? If we do, how is this different than a re-implementation of EpiNow2? If we don't, we aren't we implementing the current state of the art?
Basically, I think we should have a clear idea of what we're doing here and when we want to hand things off to the Epinowcast/EpiNow2 universe. I think there could be a compelling case for joint delay estimation and forward convolution, which will be a major weakness here. I don't know where to draw the line though, and that makes me uneasy.
An additional thought here: it may be worth exploring gamm
and/or gamm4
. I imagine the many thousands of timeseries case will be challenging and computationally intensive and we could squeeze out some better performance with the lme4 backend vs the standard nlme.
I think I have a clearer vision for those use-cases (fast approximate models implemented with a focus on real-time performance) than I do for the brms case right now?
If we go full bayes, why not implement it as a forward renewal model? If we do, how is this different than a re-implementation of EpiNow2? If we don't, we aren't we implementing the current state of the art?
The difference would be in handling delays I think which is tricky but you still could. What you would not really be able to include would be all the uncertainty handling stuff so in that sense that would be the unique place EpiNow2
would occupy.
To me the line is everything up to and including the renewal process and leave everything beyond that to other tools at least for now.
I think the main motivation for brms
right now is:
The longer term use case is ability to extend to include more complexity or to allow the user to do so (as we do in epidist
)
All that being said I don't really think you should try and support brms
in any currently planned versions. Rather I think you should make design decisions that don't constrain you to just having mgcv
as a backend (which I think the current implementation does if you want the code to be at all clean). In some sense its not a problem as you could update with breaking changes but if wanting to avoid that then worth thinking about this all now
Chiming in:
I think the main goal here is to have something we could possibly pull off the shelf and/or share with partners, ideally before the end of the calendar year. I think it's important to focus on one thing that works, rather than generality and flexibility right now.
With that in mind...
Who would the additional backend be for? What's the use-case? What workflows would we try to support?
gamm4()
for the lme4 backend and maybe mgcv::ginla()
, as these are backends that might help with speed or convergence without much of a lift. From the `gamm' documentation, it's not clear to me that the lme backend would really help with speed or stability unless we stick entirely to identity link functions.How would they mechanically be implemented? The lift would be pretty light for an additional mgcv function but brms would likely take a bit more rejiggering. Is it enough to switch to an internal S3 class structure? Do we need to think more about a public model fitting function? I'm resistant to it, but would listen if we could open up some important missing functionality. Obviously the answer to (2) is downstream of (1).
I think we should skip brms. Do you still need to talk through implementation for the other two? If it's going to be a big lift there, I could even see sticking with gam
and bam
and calling it a day.
I think we should skip brms. Do you still need to talk through implementation for the other two?
To focus this issue I don't think it's about which backends to eventually include. I think it's about should the package be designed now to handle backends from different packages (with a shared interface) or just limit to mgcv.
For the reasons already given I think it's compelling to design now for multiple backends but not actually implement any of them any time soon
including for large counts.
I don't think the size of counts interacts with the current proposed model methodology so any backend should scale well here.****
convergence
I'm not sure they will help with this but they will offer other modelling options and maybe more stable than mgcv in some circumstances (noting stable != Converged or reliable imo)?
convergence
I'm not sure they will help with this but they will offer other modelling options and maybe more stable than mgcv in some circumstances (noting stable != Converged or reliable imo).
. I think it's important to focus on one thing that works, rather than generality and flexibility right now.
I think this is the conversation to have. If you want at any point to think about other package backends I think will save a lot of pain to do the tooling to allow that seamlessly now (otherwise many breaking changes . In most instances we are taking about things like allowing users to pass ... args or overwrite hard coded internal settings.
If you don't want to support beyond mgcv, which I think is a totally reasonable place to set the scope then as is will work quite well (with a little more user options imo). The only issue is changing from this position has the most negative impact of available choices once a lot of work has been done
For the question of which backends maybe they need their own issues and potential schedule. I do think there are very compelling reasons to think about brms here but also see the counter argument about scope creep.
Seee #34 and #35 for specific backend issues.
Currently we support
gam
andbam
. In my mind these are the big two. Potential additional options aremgcv::ginla()
orbrms::brm()
, but I don't have a clear vision for these yet.EDIT: Also
{mvgam}
The questions to resolve here are:
brms
would likely take a bit more rejiggering. Is it enough to switch to an internal S3 class structure? Do we need to think more about a public model fitting function? I'm resistant to it, but would listen if we could open up some important missing functionality. Obviously the answer to (2) is downstream of (1).mgcv::ginla()