epiverse-trace / cfr

R package to estimate disease severity and under-reporting in real-time, accounting for reporting delays in epidemic time-series
https://epiverse-trace.github.io/cfr/

Difference in the time series of the delay-adjusted rolling Ebola results #152

Open adamkucharski opened 2 months ago

adamkucharski commented 2 months ago

From @avallecam

Since the last update of {cfr}, I have noticed a difference in the time series of the delay-adjusted rolling results. The new version is the figure on the right.

As Pratik explained to me, this is because cfr_rolling() calls .estimate_severity() internally, which chooses an estimation method (binomial likelihood, Poisson approximation, or normal approximation) based on the data for each day of the outbreak. The change arises because we now use this flexible approach instead of choosing a single method for the full run of cfr_rolling(), i.e., for all days.

My question is about how to interpret the sudden drops and the large increases in uncertainty that overlap the naive estimate. Based on how the figure is interpreted and how the Nishiura et al. results are reported, I would expect a single method to be used for the full run.

image (24)

adamkucharski commented 2 months ago

The update was intended to avoid using an overly slow method for large datasets or an overly approximate method for small datasets. However, it does look like something odd is happening here, especially because the estimate doesn’t converge to the ‘true’ CFR at the end. On further investigation, it looks like the normal approximation is causing the issues, and it is probably not necessary anyway because the binomial calculation is optimised (e.g. using lchoose()).
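To illustrate why an exact binomial calculation can stay cheap and stable on large counts, here is a minimal Python sketch. It is not the {cfr} implementation: the function names, the grid search, and the simplified likelihood (deaths among cases with known outcomes) are all assumptions made for illustration. `log_choose()` plays the role that lchoose() plays in R, working on the log scale so the binomial coefficient never overflows.

```python
import math


def log_choose(n, k):
    """Log of the binomial coefficient, analogous to R's lchoose()."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def binomial_loglik(deaths, known_outcomes, cfr):
    """Exact binomial log-likelihood of a candidate CFR (hypothetical helper)."""
    return (
        log_choose(known_outcomes, deaths)
        + deaths * math.log(cfr)
        + (known_outcomes - deaths) * math.log(1 - cfr)
    )


def normal_approx_loglik(deaths, known_outcomes, cfr):
    """Normal approximation to the binomial log-likelihood (hypothetical helper)."""
    mean = known_outcomes * cfr
    var = known_outcomes * cfr * (1 - cfr)
    return -0.5 * math.log(2 * math.pi * var) - (deaths - mean) ** 2 / (2 * var)


def grid_mle(loglik, deaths, known_outcomes, grid_size=999):
    """Maximise a log-likelihood over an evenly spaced grid of CFR values in (0, 1)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(grid, key=lambda p: loglik(deaths, known_outcomes, p))


# Illustrative counts only, not data from the Ebola example in this issue.
exact = grid_mle(binomial_loglik, 70, 100)
approx = grid_mle(normal_approx_loglik, 70, 100)
print(exact, approx)
```

In this sketch the exact likelihood peaks at deaths / known_outcomes, while the normal approximation can peak at a slightly different value because its variance term also depends on the candidate CFR.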

Will put together a PR to fix.

Note that, strictly speaking, the plot doesn’t show a single run: it shows multiple runs, as if we’d been repeating the analysis in real time as the outbreak got bigger, so switching likelihood method between runs is reasonable.
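The "multiple runs" point can be sketched in a few lines of Python. This is a simplified illustration, not the cfr_rolling() code: it computes only the naive estimate (cumulative deaths over cumulative cases, with no delay adjustment), and the function name is hypothetical. Each day's value is a separate estimate made from the data available up to that day.

```python
from itertools import accumulate


def rolling_naive_cfr(daily_cases, daily_deaths):
    """Re-estimate the naive CFR on each day using only data up to that day."""
    cum_cases = accumulate(daily_cases)
    cum_deaths = accumulate(daily_deaths)
    estimates = []
    for cases, deaths in zip(cum_cases, cum_deaths):
        # No estimate is possible before the first case is reported.
        estimates.append(deaths / cases if cases > 0 else None)
    return estimates


# Illustrative counts only, not data from the Ebola example in this issue.
print(rolling_naive_cfr([0, 10, 10, 20], [0, 2, 4, 6]))
```

Because every day is its own analysis on a growing dataset, the amount of data differs from run to run, which is why a per-run choice of likelihood method is defensible.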

adamkucharski commented 2 months ago

Addressed with PR #153