aakhmetz / COVID19-Replication-He-et-al-2020


Clarification question #1

Closed LucyMcGowan closed 4 years ago

LucyMcGowan commented 4 years ago

This is so great, thank you for providing your code! Your estimate of the shift (-3.7 days) seems closer to the one proposed by Ashcroft (-4) than to He's (-2.3), or am I misunderstanding?

aakhmetz commented 4 years ago

Thank you for your positive feedback!

If I read Ashcroft et al. correctly, their estimate of the shift is -25.6 days (Table 1). So I do not know where the Ashcroft et al. estimate of -4 days comes from.

In this sense, both -3.7 and the original -2.3 make more sense in my opinion. For example, the mean incubation period is about 5 days, so one may expect that the mean shift would also be somewhere between 0 and 5 days.

LucyMcGowan commented 4 years ago

Ah, I think I’m confusing the values. My understanding was that -2.3 from He corresponded to the beginning of infectiousness prior to symptom onset (which led to the suggestion of defining a “contact” as those who were in contact 2-3 days prior to symptom onset). I was thinking your estimate of -3.7 was for this value, which was more in line with Ashcroft’s -4, but maybe I’m mistaken?

For context: the difference between the initially reported -2.3 and -4 was the main result I was concerned with in Ashcroft's paper, not so much the actual distribution they generated, which seems implausible from a disease dynamics perspective. I also think the parameters are somewhat inter-related, so there is potentially another set of three parameters that would be a bit more plausible but still have the same effect, in the sense that it estimates a different result for the beginning of infectiousness prior to symptom onset. Your demonstration that including the 2 left-out data points gives -3.7 confirmed this for me a bit, but it's possible I've misunderstood! Obviously it is also somewhat concerning that 2 data points could have such an impact, so I was happy to also see the result with the added data, which seemed to land in the middle (~3.2). From a practical perspective, I am thinking about whether contact tracing programs need to be looking back 3 days (or 4) instead of just 2.

aakhmetz commented 4 years ago

Dear @LucyMcGowan. My point above was that I could not find where Ashcroft et al. obtained -4, so for me that value came out of nowhere.

Regarding your comment, I think you are right, and it would be great to have more certainty about that. There could definitely be some correlation between the onset of infectiousness and the length of the incubation period. In my opinion, this deserves another study rather than a correction of an already published one. I think the authors of the NatMed paper did their best with the data they had.

However, I have also checked the data sources, and the two problematic pairs were from China. This should be regarded with caution because many reports from China disregard the possibility of community exposure (i.e. if someone travelled from Wuhan to somewhere else, this was considered the only possible source of infection).

The first problematic pair #68 comes from Shaanxi:

[Screenshot 2020-07-17 13 03 48]

At that time there were about 20 cases reported per day. The reported cases were mainly moderate or severe.

I would expect that the 33-year-old male could have been exposed somewhere else, and not only at that dinner with the 40-year-old male. There is no additional information, so I would classify that pair as probable.

Another one #74 comes from Shenzhen:

[Screenshot 2020-07-17 13 15 24]

(unfortunately, the link is no longer accessible)

To conclude, the two extreme data points could be classified as probable rather than certain, but they definitely should not outweigh all other 75 data points.

Of note, I found nice reviews of such issues from Furukawa et al. (see Table 1 therein) and, for example, a report from the UK government.

LucyMcGowan commented 4 years ago

Thank you SO much for taking the time to respond, this is very helpful! I pulled the -4 from this line:

Thus the published profile overestimates the efficacy of contact tracing, while the corrected distribution tells us we need to look back at least 4 days to catch 90% of presymptomatic infections.

From your Weibull (1.685, 4.513) with a -3.7 shift, I think we end up with ~13% of the mass below -2.3. Among the area below 0 (so among presymptomatic infections), looking back only to -2 captures only 66% of the mass, which I think is meaningful for contact tracing programs that used the initial study as evidence to only ever look back 2 days.

Remaking Ashcroft's table 2 with your distribution:

| Time (days) | He | Ashcroft | Weibull (1.685, 4.513) with a -3.7 shift |
|---|---|---|---|
| 1 | 50% | 33% | 33% |
| 2 | 98% | 61% | 66% |
| 3 | 100% | 80% | 91% |
| 4 | 100% | 91% | 100% |
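
The percentages quoted above for the shifted Weibull can be checked with a short sketch. It assumes that (1.685, 4.513) are the shape and scale parameters, in that order, and that the -3.7 shift means the distribution starts 3.7 days before symptom onset; the function names are hypothetical and are not taken from the repository.

```python
import math

SHAPE = 1.685  # assumed: first quoted parameter is the Weibull shape
SCALE = 4.513  # assumed: second quoted parameter is the Weibull scale
SHIFT = 3.7    # days before symptom onset at which the distribution starts


def shifted_weibull_cdf(x, shape=SHAPE, scale=SCALE, shift=SHIFT):
    """Probability that transmission occurs at or before x days relative to onset."""
    t = x + shift
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))


def presymptomatic_fraction_captured(days_back):
    """Share of the presymptomatic mass (x < 0) lying within days_back of onset."""
    total = shifted_weibull_cdf(0.0)
    missed = shifted_weibull_cdf(-days_back)
    return (total - missed) / total


print(f"mass before -2.3 days: {shifted_weibull_cdf(-2.3):.1%}")
for days in (1, 2, 3, 4):
    print(f"look back {days} day(s): {presymptomatic_fraction_captured(days):.0%}")
```

Differences of about one percentage point from the table can arise from rounding of the parameters or of the intermediate values.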

Yes, I would definitely support further study. I am (personally) not as worried about forcing a correction either; I just want our tracing programs to have the best information possible! Thank you again, this repository is very helpful!