HopkinsIDD / cholera-mapping-pipeline

Formerly part of cholera-taxonomy. The map creation scripts, packages, and file structure

SDN 2016-20 #436

Open eclee25 opened 1 year ago

QLLZ commented 1 year ago

Data pull: HASH: 6772ea9da4f43ddc3da11c494a573af2ca3dd3c8 config

QLLZ commented 1 year ago

Model run: HASH: 6772ea9da4f43ddc3da11c494a573af2ca3dd3c8

QLLZ commented 1 year ago

After double checking, the SDN 2016-2020 config was missing before. Need to regenerate the config and rerun from data pull.

QLLZ commented 1 year ago

new config

QLLZ commented 1 year ago

rerun from data pull: data pull on dev: HASH c9a89d4ac5e4f63cb75fa5c006a900ece2839199

QLLZ commented 1 year ago

Model run: HASH 6772ea9da4f43ddc3da11c494a573af2ca3dd3c8

QLLZ commented 1 year ago

Country data report

eclee25 commented 1 year ago

Convergence, fits, Rhats look good, but there is a strong effect of the adjacency matrix on the rate maps that seems unusual.

Opinion: Investigate adjacency matrix

javierps commented 1 year ago

Maybe same issue on population density as in NER. Large and sparsely populated places are hard for the model to fit?

javierps commented 1 year ago

The DAGAR fix should solve this; waiting for the ID 45 report.

QLLZ commented 1 year ago

Config

Data pull:

HASH: 211913e0eb7d570ed62c48bc1ae9ecd6671a6cbc

QLLZ commented 1 year ago

Failed stan model run log file

QLLZ commented 1 year ago

Rerun on dev_u_combs_fix

HASH: e56580fa5ddab00293e31ff90139351f80ae3f6c

QLLZ commented 1 year ago

country data report

QLLZ commented 1 year ago

Overall, cases are overestimated. Convergence looks good.

eclee25 commented 1 year ago

Instability in sd_w traceplots but otherwise the model convergence looks good. There is overestimation in 2016 and 2019 and it's not clear why. Perhaps the 2016 overestimation is related to the multi-year observation from 2016-2018?

Investigate the 3-year multi-year obs in OCs 20189? 21107? It's hard to tell which OCs they are in based on the cdr -- may be better to verify which OCs have these observations in the stan input file.

I think we could perhaps try running it after removing or handling the 3-year multi-year obs in a different way?
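One way to do the check suggested above is to scan the observation records for any whose time window covers more than one calendar year and group them by OC. A minimal sketch in Python, assuming the records have been exported as rows with hypothetical fields `oc_id`, `t_start`, and `t_end`; the actual stan input file layout may differ, and the rows below are made-up placeholders, not SDN data:

```python
from collections import defaultdict
from datetime import date

def multi_year_obs_by_oc(rows, min_years=2):
    """Group observations covering >= min_years calendar years by OC id.

    Field names (oc_id, t_start, t_end) are hypothetical.
    """
    found = defaultdict(list)
    for row in rows:
        # Count calendar years touched by the observation window.
        n_years = row["t_end"].year - row["t_start"].year + 1
        if n_years >= min_years:
            found[row["oc_id"]].append((row["t_start"], row["t_end"], n_years))
    return dict(found)

# Placeholder rows for illustration only.
rows = [
    {"oc_id": "A", "t_start": date(2016, 1, 1), "t_end": date(2018, 12, 31)},
    {"oc_id": "A", "t_start": date(2017, 1, 1), "t_end": date(2017, 12, 31)},
    {"oc_id": "B", "t_start": date(2019, 6, 1), "t_end": date(2019, 9, 30)},
]
result = multi_year_obs_by_oc(rows)
```

Setting `min_years=3` would restrict the result to the 3-year observations only.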

QLLZ commented 1 year ago

Double checked the 3-year observation in 21089. It looks correct based on the source doc.

javierps commented 1 year ago

The 2016 data is an imputed observation, so it may make sense that it is hard for the model to fit.

For std_dev_w, given the very high values of rho I don't think this has a large impact (the w traceplots look very good).

I don't think the multi-year observations are breaking things here, but could try a run without it.

eclee25 commented 1 year ago

Rerun with updated national od parameter

eclee25 commented 1 year ago

@javierps determined that the initialization failures we're having with the new od admin0 parameter are caused by numerical stability issues in the evaluation of the censored observation probability. Added a possible fix to the Stan code that improves numerical stability when the probability is close to 0 or 1 (implemented as dummy censoring variables).
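For illustration, the kind of instability described here can be sketched outside Stan. This is a generic Python sketch, not the pipeline's Stan code and not the actual fix: it assumes the censored probability is formed as 1 - exp(log_cdf), which loses all precision when the CDF is very close to 1. The name `log1m_exp` mirrors Stan's built-in function of the same name; the implementation and the naive comparison function are only illustrative.

```python
import math

def log1m_exp(a):
    """Numerically stable log(1 - exp(a)) for a < 0."""
    if a >= 0:
        raise ValueError("a must be negative")
    # Near 0, exp(a) is close to 1, so use expm1 to avoid cancellation.
    if a > -math.log(2):
        return math.log(-math.expm1(a))
    # Far from 0, exp(a) is small, so log1p is accurate.
    return math.log1p(-math.exp(a))

def naive_log1m_exp(a):
    """Direct evaluation; returns -inf when 1 - exp(a) rounds to 0."""
    p = 1.0 - math.exp(a)
    return math.log(p) if p > 0 else float("-inf")

log_cdf = -1e-20  # log CDF extremely close to 0, i.e. CDF close to 1
stable = log1m_exp(log_cdf)
naive = naive_log1m_exp(log_cdf)
```

Here the naive form collapses to a log probability of negative infinity while the stable form stays finite. A non-finite log density at the initial values is one way a Stan run can fail to initialize.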

javierps commented 1 year ago

Sep 2023 production run: chain differences in std_w; otherwise convergence OK, Rhats OK.

Suggestion: accept.

eclee25 commented 1 year ago

Noting that Tab 6 2017 estimates are between the two observations and the PI does not include either obs, likely due to the slightly restricted overdispersion param here. Other diags look okay.

Discuss this but leaning to Approve. We are okay with the 2017 estimate but we want to rerun the cdr to show the 2016 imputed observations.

Action Required - rerun cdr

eclee25 commented 1 year ago

2016 imputed obs is ok. Approve