Full `pidp` retention through the data generation pipeline is already in place (see #444), but Minos currently initialises without completing a single year of the run. Next steps:
[x] Running with US (i.e. non-synthpop) input pop...
[x] Current issue: `KeyError` in mortality lookup due to missing values of `sex` -> fixed in #458 (missing types are now imputed/replaced with NaN during imputation; the `KeyError` was due to `-9` values of `sex` in the mort/fert lookups)
[x] Account for missingness, either via naive imputation by hierarchical sampling from existing data (already done outside Minos, must copy over) or via MICE (already on `child_poverty_May24`) -> separated into #457
[x] Still some missingness in imputed data -> partly fixed in e7972aa for `sex` and `ethnicity` but not for `region` (11/29271 = 0.04% for 2020 data) -> marking as done and moving to a new issue (#459), as this is of more general interest for other variables with missingness; work can continue here with this very small level of missingness
[x] Running with full synthpop (or 1% of it)...
[x] Update to full GB synthpop (currently Scotland only) -> done in #461
[x] Refactor to select only necessary variables before merging US with SP, to reduce size -> done in #461 in `US_individual_upscaling` (multiple commits)
[x] Generate synthpop from fully retained US population, this must include composite vars from Minos, which was another motivation for getting full retention through data generation. (Already done outside Minos, must copy over and test.) -> as above
[x] ~Generate additional years of synthpop input (e.g. 2019-2021, as in IE (II))~ -> moved to #464 to draw a line under this issue
[x] Check that the correct years are being selected in Minos/synthpop (i.e. go through the year naming conventions used) -> done in #461 in various places, e.g. `gb_scaled` config files (2019 rather than 2020), DG makefiles, `US_individual_upscaling`, `generate_repl_pop`; not complete though, as (e.g.) rate tables and external data for validation are still off by a year -> moving to #464 to complete
[x] Allow for specification of synthpop as input population -> done in #461 in various places
[x] Ensure latest LSOA, ward, LA and region definitions are present -> done in #461 by adapting IE code to `US_individual_upscaling.add_spatial_attributes`; not used anywhere yet but will be required for validation -> carried over to #464 so as not to forget to test
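
For reference, the kind of fix described in the checklist for the `sex` lookups (replacing `-9` codes with NaN, then checking how much missingness remains) could be sketched roughly as below. This is a minimal illustration, not the code from #458 or e7972aa: the function names, the toy data and the category labels are hypothetical, and only the `-9` code and the variable names `pidp`, `sex` and `region` come from this issue.

```python
import pandas as pd

# Assumption: only -9 is confirmed in this issue as a missing-value code;
# other negative codes may need adding to this list.
MISSING_CODES = [-9]


def replace_missing_codes(df, columns, codes=MISSING_CODES):
    """Return a copy of df with the given codes set to NaN in the given columns."""
    out = df.copy()
    # mask() sets cells to NaN wherever the condition is True
    out[columns] = out[columns].mask(out[columns].isin(codes))
    return out


def missing_share(df, column):
    """Fraction of rows where the column is NaN."""
    return df[column].isna().mean()


# Hypothetical toy population, for illustration only
pop = pd.DataFrame({
    "pidp": [1, 2, 3, 4],
    "sex": ["Male", -9, "Female", "Female"],
    "region": ["London", "Wales", -9, "Scotland"],
})

clean = replace_missing_codes(pop, ["sex", "region"])
print(clean)
print(f"region missing: {missing_share(clean, 'region'):.2%}")
```

Rows with NaN in `sex` would still need imputing or dropping before any lookup keyed on `sex`, otherwise the lookup can fail in the same way as before.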