Serial intervals smaller than the shift-parameter are dropped from the calculation

pdehaye commented 4 years ago

A new paper was just put up on the Arxiv, with title COVID-19 infectivity profile correction and abstract:

The infectivity profile of an individual with COVID-19 is attributed to the paper Temporal dynamics in viral shedding and transmissibility of COVID-19 by He et al., published in Nature Medicine in April 2020. However, the analysis within this paper contains a mistake such that the published infectivity profile is incorrect and the conclusion that infectiousness begins 2.3 days before symptom onset is no longer supported. In this document we discuss the error and compute the correct infectivity profile. We also establish confidence intervals on this profile, quantify the difference between the published and the corrected profiles, and discuss an issue of normalisation when fitting serial interval data. This infectivity profile plays a central role in policy and decision making, thus it is crucial that this issue is corrected with the utmost urgency to prevent the propagation of this error into further studies and policies. We hope that this preprint will reach all researchers and policy makers who are using the incorrect infectivity profile to inform their work.

The offending line is here: https://github.com/ehylau/COVID-19/blob/d489d015d973fb578fec595abdfa2fdf2849838d/Fig1c_Rscript.R#L67

mhoehle commented 4 years ago

After reading the Arxiv report, I think the title of this issue is not completely describing the problem.

The concern is that the paper uses a left-shifted gamma with support on [-c, \infty], but if the starting value of c is smaller than some of the observed values for the serial interval, then the above quoted offending line of code results in those having no contribution to the likelihood. Since the start value is 2.5, but 2 observations have a value smaller than this. So maybe change title to: "Serial intervals smaller than the shift-parameter are dropped from the calculation"?

Regarding the content, it might be prudent to change the starting value of the optimization in

inf.fit = optim(
  c(2, 0.5, 2.5), lli.fx, 
  x.lb = data[, "x.lb"], 
  x.ub = data[, "x.ub"],
  y    = data[, "y"],
  lnpar = c(ln.par1, ln.par2)
)

from 2.5 to something larger. Values of 7 did not work for me, but a value of 4 resulted in a infectiousness profile similar to the arxiv report.

Will be interesting to hear what the code owner & authors of the paper have to say about this. E.g. such observed negative serial intervals (-3 and -4 days for observations 54 and 68) might be a bit suspicious in the first place... Maybe the order of infection was the other way around for the respective infector-infectee pairs? What might also be a factor is if the two reported very different symptoms, e.g. infector something like loss of taste or shortness of breath and 2nd individual something lighter such as headache? Basically, reported day of symptom onset for COVID-19 appears somewhat subjective and might depend quite a bit on how you ask...

pdehaye commented 4 years ago

I am not an epidemiologist, but have heard of similar cases in South Korea.

mhoehle commented 4 years ago

Unfortunately, there is no answer to this issue by the code owners yet. However, it appears a new version of the Fig 1(c) script was uploaded as part of b24b01e. The relevant code block was changed to:

fit = optim(
  c(10, 0.5, 20), lli.fx, 
  method = "L-BFGS-B",
  lower  = 1e-3 + c(1,0,4),
  x.lb = data[, "x.lb"], 
  x.ub = data[, "x.ub"],
  y    = data[, "y"],
  lnpar = c(ln.par1, ln.par2)
)

When running the updated script the following figure appears:

grafik

The infectiousness profile (middle figure) is now more to the left as previously, i.e. the -3 and -4 observation are now taken into account. Would be good, if the code owners could comment this.

ehylau / COVID-19

Serial intervals smaller than the shift-parameter are dropped from the calculation #2