reconhub / covid19hub

Community-driven COVID-19 analytics in R
58 stars 2 forks source link

Package: bed occupancy forecasting #3

Open thibautjombart opened 4 years ago

thibautjombart commented 4 years ago

Package: bed occupancy forecasting

Description

Package existing code to implement forecasting of daily bed occupancy from input on daily admissions and the distribution of duration of stay.

The package should include documentation, simple reproducible examples, unit tests.

Use cases / specs

  1. predict bed occupancy from existing admission data
    • input dates of admissions as Date
    • input numbers of admissions as integer
    • input distribution of the length of stay (LoS) as a distcrete object (preferably) or as a function returning integer numbers along the lines of rgeom()
  2. do the above, but with several admission trajectories (e.g. from different models)
    • input on admissions as a projections object
    • input LoS as in 1.
  3. outputs should be projections objects

Relevant related pacakges

Starting point

A starting point is provided in the two following gists:

Impact

Some of the main questions for COVID19 response relate to hospital bed capacity. For this, multpile modelling groups are making forecastings of bed occupancy, e.g. our app. However there is no consistent way of doing this, leaving room for potential errors and consistency across results. Having a reliable, computer-efficient, well documented and tested way of forecasting bed occupancy will improve the quality of, and the ease for providing, these predictions.

Proposed timeline

Focal point

@thibautjombart

TimTaylor commented 4 years ago

If this is still required I've started packaging it up over at https://github.com/tjtnew/bedoc. A little hectic today but can probably get something reasonable finished by c.o.p tomorrow if it's still required.

thibautjombart commented 4 years ago

Yes, this is still needed :) I'll remove my totally unrealistic deadline, but this is still very topical and timely. Many thanks for getting started on this!

TimTaylor commented 4 years ago

A few questions:

TimTaylor commented 4 years ago

I've thought some more in regards to the second and third bullets above. I think it would make sense to have simulate_occupancy as an internal function and only export bed_forecasts. The output of simulate_occupancy can be recovered from bed_forecasts by only inputting a single projection so I think this is the cleaner option. We can then also move the sanity checks in to bed_forecasts and just test that functionality. If you agree I'll modify later and then start on a more detailed use case for a vignette.

thibautjombart commented 4 years ago

Sorry about the slow response on this.

  • What are you trying to avoid with the sanity check n_admissions[1] >= 1. A user may want to illustrate the ramp up from zero cases over a particular interval. Perhaps a warning instead to highlight if they start with zero cases in case they didn't mean to?

That's a mistake, having it's origin in a technicality, from memory: the function was throwing an error if there was a day of zero admissions due to using rep (maybe?). At any rate, it should be all(n_admissions >= 1) rather than the current test. Note that the corner case where all data in n_admissions is 0 may have to be handled separately: in this case the model will not predict a single bed.

  • Are you wanting both simulate_occupancy and bed_forecasts to be exposed to the user?

I named these poorly. I think ideally this should perhaps be a generic with specific S3 methods for different classes of objects? And to be consistent with projections, maybe we could settle for project_beds(). What do you think?

  • If simulate_occupancy is supposed to be exposed to the user then the code restriction to only predictions dates <= the maximum provided admission date is a little confusing. The reason I say this is that "use case 1", "predict bed occupancy from existing admission data" would seem to make this redundant otherwise. Should we instead allow the user to specify the end date (but default to the maximum admission date).

Yes (to the last suggestion).

This point can be ignored if simulate_occupancy is only an internal function to be used only by bed_forecasts. Hope these q make sense!

It does. Thank you so much for your help - very appreciated!

thibautjombart commented 4 years ago

I've thought some more in regards to the second and third bullets above. I think it would make sense to have simulate_occupancy as an internal function and only export bed_forecasts. The output of simulate_occupancy can be recovered from bed_forecasts by only inputting a single projection so I think this is the cleaner option. We can then also move the sanity checks in to bed_forecasts and just test that functionality. If you agree I'll modify later and then start on a more detailed use case for a vignette.

OK, it would make sense, but maybe a middle group, using S3 generic / method, would be to provid a method for incidence objects too, so that users can:

An easy way to go about this would be adding a as_projections.incidence method (or as(..., "projections")) in projections to convert incidence objects to projections. This is what is done effectively at: https://gist.github.com/thibautjombart/7dacaee1e1ec3e5bf27e522d9312be76#file-simulate_occupancy-r-L87-L89

TimTaylor commented 4 years ago

OK, it would make sense, but maybe a middle group, using S3 generic / method, would be to provid a method for incidence objects too, so that users can:

* forecast bed occupancy from existing admission data (stored as `incidence`)

* forecast bed occupancy from projected admissions (stored as `projections`)

That sounds good.

In regards to a vignette is any of the data/projections used in your paper available in the open? I'm guessing not, but it would make a nice example if they were.

thibautjombart commented 4 years ago

Unfortunately no, no publicly available data for this specific example. But we can try to make a decent example using realistic epi parameters. I paste below code for a length-of-stay distribution publicly available, with some random admissions as incidence:


## Data from across 23 countries with “many of the included cases from the
## UK”. 290 patients, of which 163 with recorded outcomes. 52 admitted to
## ICU. Data collected on patients with data collection commenced on or before
## 13 March 2020. Report issued 27 March 2020.
## https://media.tghn.org/medialibrary/2020/03/ISARIC_Data_Platform_COVID-19_Report_27MAR20.zip

## Mean hospital LOS to death or discharge (n=163) 5.3 days (SD 4.25)
## Roughly: median 5 days, QIR: 3-7
los_isaric <- distcrete::distcrete("weibull", shape = 2, scale = 6, w = 0.5, interval = 1)

## make toy admission data
library(incidence)
library(magrittr)
admissions <- sample(Sys.Date() - c(0,7), 40, replace = TRUE) %>% 
  incidence()

## project bed occupancy for current admissions 

## project future admissions
[...]

## project bed occupancy for current + future admissions
[...]

Note that the missing bit, beyond the bed occupancy part, is a serial interval distribution (si in projections::project). This can be obtained by:

  1. getting published parameters e.g. from MIDAS
  2. converting, if needed, these parameters to build a distribution (things like epitrix::gamma_mucv2shapescale may help to convert parameters specified in the literature into something we can build a gamma with)
  3. using distcrete to build a discretised distribution

I will ask what serial interval is currently used for UK data.

TimTaylor commented 4 years ago

project_beds is now a generic with methods for projections and incidence objects. I've changed the code a bit from your initial gists so it's probably worth you kicking the tyres a bit.

I've included a few simple integrations tests and we have good coverage save for the sanity checks which we could easily check although perhaps a little unnecessary.

CI is set up with GitHub Actions which checks the build on mac (release + devel), windows (release) and linux (release). It also builds the basic pkgdown site and integrates coverage with codecov. Currently these run as separate jobs so this could be streamlined somewhat.

From your last comment I take it you want to implement a predict_admissions function. Is it worth me starting with a simple exponential growth method to begin with (the doubling approach in the app) and we can build on this going forward.

Can we move future discussion over to https://github.com/tjtnew/bedoc/issues as it will be good to keep the discussion tied to the repo. On a similar note let me know if you want to shift the repo over to Reconhub and for us continue development there.

Have a good weekend.

thibautjombart commented 4 years ago

Impressive! Great, I will take the conversation there.

TimTaylor commented 3 years ago

Package now at https://github.com/reconhub/occupancy