karimn / covid-19-transmission


Investigating France divergences #28

Open karimn opened 4 years ago

karimn commented 4 years ago

Run high adapt delta to see if we can eliminate all divergent transitions.

Job 64395309 problem countries: DO (singleton), SE, ES, MY, NL, FR (high divergence but Rhat < 1.01)

karimn commented 4 years ago

Looking at DO, this looks like the ifr_noise problem that we tried to address in #23 and #21. See the report here: mobility_report.pdf. The model does appear to get stuck at a low ifr_noise = exp(-0.6) ≈ 0.55.

Let's try with an even tighter prior: log(ifr_noise) ~ N(0, 0.025)
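For intuition on why the tighter prior should help (a quick numeric check, assuming the 0.025 in N(0, 0.025) is a standard deviation rather than a variance):

```python
import math

# ifr_noise value the chains appear to get stuck at (from the report)
stuck = math.exp(-0.6)          # ≈ 0.549

# Tightened prior: log(ifr_noise) ~ N(0, sd=0.025)  (sd assumed, not variance)
sd = 0.025
z = abs(math.log(stuck)) / sd   # distance of log(0.549) from 0, in prior sds
print(f"stuck value: {stuck:.3f}, z-score under tight prior: {z:.0f}")
# 0.6 / 0.025 = 24 prior sds: the tightened prior makes that region
# essentially unreachable, which is the point of the fix
```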

karimn commented 4 years ago

Still need to address non-singleton countries. New job 64426766: FR, GB

karimn commented 4 years ago

FR report from latest run mobility_report.pdf

@wwiecek what do you think of this report? I can't find any reason for the divergences (so far), and the Rhat and ESS are good for FR. The R0s seem off for some sub-regions, like Corsica. Notice also that S_t drops to zero.

wwiecek commented 4 years ago

R0s are off for every region. The underlying problem is that the model thinks the number of infections was huge.

Some of this nonsense behaviour may be related to not having any cases before May? Some data glitch. Let's try fixing that and re-running.

If that does not help, try fixing IFR to 0.013. I don't know why (maybe there is a problem with the model code?), but the predicted deaths are very variable even though the numbers infected are not. Look here:

image

Huge "upper" tail above the median, even though the model decided that everyone got infected. If that were the case, you should have a narrow interval for expected deaths, basically IFR × (population size).
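As a back-of-envelope check of that claim (illustrative numbers only: a population roughly France's size and the 0.013 IFR suggested above): if everyone were infected, deaths would be approximately Binomial(N, IFR), whose sampling interval is tiny relative to the mean:

```python
import math

N = 67_000_000      # illustrative population (roughly France)
ifr = 0.013         # fixed IFR suggested in the thread

mean_deaths = N * ifr
sd_deaths = math.sqrt(N * ifr * (1 - ifr))   # binomial standard deviation

print(f"expected deaths ≈ {mean_deaths:,.0f} ± {2 * sd_deaths:,.0f} (2 sd)")
# the 2-sd band is well under 1% of the mean, so a huge upper tail in
# predicted deaths is inconsistent with "everyone got infected"
rel = 2 * sd_deaths / mean_deaths
print(f"relative width: {rel:.4%}")
```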

wwiecek commented 4 years ago

> Looking at DO, this looks like the ifr_noise problem that we tried to address in #23 and #21. See the report here: mobility_report.pdf. The model does appear to get stuck at a low ifr_noise = exp(-0.6) ≈ 0.55.
>
> Let's try with an even tighter prior: log(ifr_noise) ~ N(0, 0.025)

An IFR of about 0.3% to 0.5%, as in your PDF output, is sensible. The S(t) going into negative values is concerning. Maybe worth investigating why there is such a huge leap in cases on one particular day?

karimn commented 4 years ago

Further tightening of the IFR prior fixed the DO problem for now. The problem with IFR for that model is visible in the pairs plot at the end of the report.

@wwiecek I'm not following what you're suggesting. What data glitch are you referring to? Something with number of cases? The model doesn't use data on cases.

Yes, my go-to should always be to fix IFR and see what happens.

wwiecek commented 4 years ago

You do use cases to set t*, no? (And we still need to change that, BTW!) And the case data looks bad:

image

This results in nonsensical behaviour of the logistic adjustment:

image

This should not have much of an impact on what happens to R early on, but it's worth fixing before trying other things. It's weird.
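For reference, a sketch of the kind of rule being discussed (the function name and threshold are hypothetical; the repo's actual rule may differ): t* chosen as the first day cumulative cases cross a threshold, which a glitch that dumps all early cases onto one late day would obviously corrupt:

```python
def choose_t_star(daily_cases, threshold=50):
    """Index of the first day cumulative cases reach `threshold`,
    or None if they never do. (Illustrative rule, not the repo's actual one.)"""
    total = 0
    for day, cases in enumerate(daily_cases):
        total += cases
        if total >= threshold:
            return day
    return None

# A glitch where early cases all land on one late day shifts t* badly:
clean    = [0, 5, 10, 20, 40]   # cumulative crosses 50 on day 4
glitched = [0, 0, 0, 0, 0, 75]  # same sort of total, dumped on day 5
print(choose_t_star(clean), choose_t_star(glitched))   # 4 5
```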

karimn commented 4 years ago

I see what you mean now. OK, I'll create an issue and assign it to Junyi to follow up on the data.

karimn commented 4 years ago

Waiting on #31

wwiecek commented 4 years ago

For now we can re-run with t* set to, e.g., 10 March? (That's about when France had an influx of cases and started closing things.)

It's not essential to have France in our results, but we hypothesised that something else may be causing the problems, so we may want to check.

karimn commented 4 years ago

Alright, I'll add an option to hardcode t*, but we need a systematic way to do this, as you mentioned in #31. I have no prior beliefs/expertise here, so I'll wait for your input.

karimn commented 4 years ago

Hardcoding t* causes the model to time out; some chains take a lot longer to finish, which is never a good sign. I need to re-run once the cluster server is up tomorrow.

wwiecek commented 4 years ago

> Hardcoding t* causes the model to time out; some chains take a lot longer to finish, which is never a good sign. I need to re-run once the cluster server is up tomorrow.

Was the problem specifically when re-running France? Just to be clear, t* is already "hardcoded" in the model in the sense that it's a single date passed to the model, no? The difference is that typically you determine it from data, whereas here we put in a subjective value.

karimn commented 4 years ago

This is specific to France using the hardcoded March 10 that you suggested in this issue (see above).

wwiecek commented 4 years ago

So I'd infer the problem is not with t*; the problem is with fitting the French data, period, right? But when I looked at the report, the contact rates/t* were the only things that were obviously off. I think your go-to solution was fixing IFR, but maybe you've done that already? (For the record, and this is neither here nor there, but I do not like the fixing-IFR thing because I don't understand why our tight prior is not good enough. Maybe the answer is that this model is so far from the observed data that it goes into really crazily low likelihoods, so sometimes IFR will want to land in a region that wouldn't really happen with "good" models