Closed avallecam closed 6 months ago
I don't think removing unknowns is appropriate if we're calculating CFRs from incidence data (with the linelist data processed only as a step to generate that incidence). The Ghani et al. paper is based on individual outcomes via survival analysis approaches, so outcome type does matter (see also issue #7), whereas the incidence-based analysis adjusts for as-yet-unknown outcomes in real time, so removing these unknowns would probably bias estimates (as it seems to in the estimate_static() calculation above).
Carmen and I put together some case studies that hopefully illustrate the linelist estimation vs incidence estimation issues more clearly: https://github.com/CarmenTamayo/Applications-Epiverse-pipelines/blob/ak-edits/Marburg_underreporting.Rmd
Thanks @avallecam and @adamkucharski - just to clarify, is this a feature we need to add in some way? We already allow replacing NA with zeros in prepare_data.incidence2(). My understanding is that we're currently okay as things are?
Closing this as {cfr} is not intended to work with linelist data. Users can/should convert their linelists to incidence data before using it with {cfr}.
Following Ghani, 2005 and Lipsitch, 2015, to estimate CFR from linelist data it's suggested to include deaths and recoveries, but exclude all unknown outcomes.
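As a rough illustration of the arithmetic behind this suggestion, the sketch below computes a naive CFR on a toy linelist, once among resolved cases only and once with unknown outcomes left in the denominator. The data frame, its column names, and the outcome labels are all made up for illustration; this is not the vignette's data or any {cfr} function.

```r
# Hypothetical toy linelist; column names and values are illustrative only.
linelist <- data.frame(
  case_id = 1:10,
  outcome = c("Death", "Recover", NA, "Death", "Recover",
              "Recover", NA, "Recover", "Death", NA)
)

# Keep only cases whose outcome is known (death or recovery).
known <- linelist[!is.na(linelist$outcome), ]

deaths <- sum(known$outcome == "Death")
recoveries <- sum(known$outcome == "Recover")

# Naive CFR among resolved cases: deaths / (deaths + recoveries).
cfr_known <- deaths / (deaths + recoveries)

# For comparison: unknown outcomes left in the denominator,
# which implicitly treats them as survivors.
cfr_all <- deaths / nrow(linelist)
```

The two values differ only in the denominator, which is why the handling of unknown outcomes matters for estimates computed directly from individual-level data.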
This step still needs to exclude all unknown outcomes:
https://github.com/epiverse-trace/cfr/blob/66bfa793e6d2a6f51f8de5a1ad6056b74c54350a/vignettes/data_from_incidence2.Rmd#L70-L76
This can be solved by adding one more filter to this step.
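The original snippet for this filter is not reproduced here. A minimal base R sketch of what such a filter could look like, assuming a hypothetical linelist with date_onset and outcome columns and the labels "Death" and "Recover" (none of which are taken from the vignette), is:

```r
# Hypothetical linelist; column names and outcome labels are assumptions.
linelist <- data.frame(
  date_onset = as.Date(c("2023-01-01", "2023-01-01", "2023-01-02",
                         "2023-01-02", "2023-01-03")),
  outcome = c("Death", "Recover", NA, "Recover", "Death")
)

# The extra filter: keep deaths and recoveries, drop unknown (NA) outcomes.
resolved <- linelist[linelist$outcome %in% c("Death", "Recover"), ]

# Aggregate to daily counts of cases and deaths.
# Simplification: deaths are indexed by onset date here, not date of death.
resolved$cases <- 1L
resolved$deaths <- as.integer(resolved$outcome == "Death")
incidence <- aggregate(cbind(cases, deaths) ~ date_onset,
                       data = resolved, FUN = sum)
```

Making the filter a separate, named step keeps the choice of which outcomes are kept or dropped visible to the reader, whatever the final decision on unknowns turns out to be.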
I agree that the drop and keep steps for linelist data should be made explicit at this step.
Here is a reprex to compare against the current vignette output.
Created on 2023-08-22 with reprex v2.0.2