Support reporting triangle like data

seabbs commented 2 weeks ago

This would be data that is decomposed into report and reference dates. Current functionality would then be a special case of this.

It would need.

[ ] An additional delay spline in the model to model the delay
[ ] Support for passing in a offset to represent a externally estimated delay (I think of this as a prior for the spline).
[ ] A stretch goal would be support for time and/or region varying additional delay layers
[ ] Posterior prediction tooling that sums up available data with predictions from the reporting triangle to give a nowcast.

This functionality would be similar to that in

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10538717/#:~:text=In%20May%202022%2C%20cases%20of,travel%20links%20to%20endemic%20regions.

and

https://www.medrxiv.org/content/10.1101/2024.07.19.24310696v1

by @OvertonC and @jonathonmellor and co. Both of these have code and synthetic testing data that have open licences I believe.

zsusswein commented 2 weeks ago

This is definitely in scope for v0.3.0. I'd like to do it sooner, but I think partial pooling is likely higher priority. A few follow-up qs:

report and reference dates

Do you think it's cleaner to pass delay day & reference date or report date and reference date? Two sides of the same coin...

Support for passing in a offset to represent a externally estimated delay (I think of this as a prior for the spline).

This is in addition to the delay spline in the same model fit or as an an alternative version that takes just one vintage and does a nowcast correction to the last $D$ points?

A stretch goal would be support for time and/or region varying additional delay layers

I'd assume region-varying independent fits is required? Are you suggesting like

y ~ 1 + s(reference_date, bs = 'ad') + s(reference_date, by = group, bs = 'sz') + s(delay, by = group)

And then something clever I need to look into more with a ti(reference_date, delay) term. I don't know how ti() handles discrete levels, so I need to figure that out before I propose something with it.

jonathonmellor commented 2 weeks ago

Code can be found here: https://github.com/jonathonmellor/norovirus-nowcast/tree/main This example is national only, but in practice for other incidents we do a lot of partial pooling across areas (a bit outbreak dependent).

If I could "do it all again" in my mgcv work I would have switched to the approximate GP (or RWs) a lot earlier rather than tp or cr splines for nicer extrapolation properties, but that may not be in scope for this work!

zsusswein commented 2 weeks ago

Thank you @jonathonmellor! I was reading your manuscript yesterday, but haven't looked at the code yet. I'm excited to check it out.

I would have switched to the approximate GP (or RWs)

You mean epinowcast here? Or bs = 'gp'? If you have some intuition about what might be more performant, I think scope is flexible.

jonathonmellor commented 2 weeks ago

I'll put in an intro chat at some point as well!

I mean in the mgcv case, so yes bs='gp'. A big part of my Omicron time was fiddling with different knot numbers for either growth rate models or forecast models - the tp and cr splines are quite sensitive at the end of the known data to the choice (which I'm sure you're aware of having had a peak at the code here!).

The cr and tp are poorly behaved when extrapolating far from the known data, and even at the edge of the known data you can get some quite varied results depending on choice of k. It was particularly painful for weekly updates to models when there are day-of-week reporting effects. In our nowcasting example, we are doing partially "extrapolation" and partially data informed smoothing (with the data triangle), so I expect this instability effect to be present.

I don't think there's a nice answer to that problem, but we largely avoid it now by using a bs=gp instead and a sensible characteristic length-scale, which seems more stable, and give more sensible uncertainty.

Sorry bit of a info dump there, am just really pleased to see more epi GAM work

jonathonmellor commented 2 weeks ago

While I'm on a roll, worth checking out: https://github.com/eric-pedersen/MRFtools/issues/10

For how to implement a RW1 in mgcv, which has been fun to play around with + good for baseline models.

Will add a warning that mrf type objects are a pain when packaging or working in parallel with mgcv

zsusswein commented 2 weeks ago

I'll put in an intro chat at some point as well!

That would be great!

I don't think there's a nice answer to that problem, but we largely avoid it now by using a bs=gp instead and a sensible characteristic length-scale, which seems more stable, and give more sensible uncertainty.

I'll make sure this is high-ish on the list. I want to get some minimal usable thing before getting to work on performance, but I'm hoping get that moving here soon!

Sorry bit of a info dump there, am just really pleased to see more epi GAM work

I hope we can make this useful! It's really been your success with GAMs that got us interested (as I hope the readme makes clear). We're still in very early days, but the dream would be to eventually mature this into something even the experts like you all would find helpful in some capacity.

For how to implement a RW1 in mgcv, which has been fun to play around with + good for baseline models.

I remember @seabbs also sharing an interesting thing from that world on time-varying day-of-week effects. I'll go take a look, thanks!

kgostic commented 1 week ago

Late to the party here, but I'd love to talk more at some point about this gp vs spline discussion. In our EpiNow2 modeling, the gp has been the main source of convergence issues and weird fits, so I had been hoping to drop it. We've mostly been able to get away with this by setting the sampler parameters to be really conservative, so the models do usually converge, but occasionally we see:

failure to converge (high Rhat, lots of divergent transitions, or clustered divergent transitions) -- this is very rare, but hard to deal with when it happens
convergence, but with what appears to be a bimodal posterior for the gp length scale, which in turn makes the prediction interval around Rt oddly shaped, and asymmetrical about the median, (unsurprising given overlapping waves with different periods)

It sounds like the root issue is the same whether we're working with splines or a gp: the level of smoothness in the data changes over time, and sometimes we'd need an adaptive smoother to match the reported patterns.

Would love to chat and compare notes on how best to deal with this in practice!

jonathonmellor commented 1 week ago

Keen to discuss! Have pinged an email over.

Part of the challenge might be around what's the purpose of the smoother, is it only smoothing or are we relying on it for extrapolation.

Perhaps some of the challenges relate to the length-scale being estimated rather than being a fixed parameter, though that might be mgcv's approximation hiding some problems...

But yes fully agree, adaptive smoothers would be really interesting to explore.

seabbs commented 1 week ago

Do you think it's cleaner to pass delay day & reference date or report date and reference date? Two sides of the same coin

I don't have a strong preference though dates are often easier for the user and indexes are often easier for the dev.

This is in addition to the delay spline in the same model fit

In addition. I think it may act like a prior and enable different delay dists per strata but the same residual spline to be shared which I can see being performant.

Are you suggesting like

Yes

we largely avoid it now by using a bs=gp instead and a sensible characteristic length-scale, which seems more stable, and give more sensible uncertainty.

I agree though haven't really played with the mgcv implementation. I also like the rw implementation from https://github.com/eric-pedersen/MRFtools/issues/10. Coming in via the penalty matrix suggests it may be possible to parameterise other model options like ARs here as well which would be appealing (especially without having to extend into stan like mvgam currently does I believe).

n interesting thing from that world on time-varying day-of-week effects

Do you remember where that was @zsusswein. It seems really promising so worth putting into an issue/on peoples radar.

the level of smoothness in the data changes over time, and sometimes we'd need an adaptive smoother to match the reported patterns.

Agree.

Part of the challenge might be around what's the purpose of the smoother, is it only smoothing or are we relying on it for extrapolation.

and yes fixing the lengthscale (or places a strong prior on it) prevents you seeing issues but at the cost of maybe not learning when things are different that you expect.

zsusswein commented 1 week ago

It was this one! Super fun decomposition and I think I missed on my first read that it requires mvgam to work because of the monotonic constraint on the spline

CDCgov / cfa-gam-rt

Support reporting triangle like data #32