Open paddy-r opened 9 months ago
Just to butt in: this should be doable from the config file by selecting which input rate table to use for fertility. I've done this before, coarsely, for switching between mortality rate tables at the LSOA/LA level.
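A minimal sketch of the config-driven switch described above. The config key `fertility_rate_table`, the option names and the file paths are all hypothetical, chosen for illustration; the repo's real config layout may well differ.

```python
# Hypothetical mapping from a config option to a rate table file.
# Key names and paths are illustrative assumptions, not the project's real ones.
RATE_TABLES = {
    "parity": "data/fertility_rates_with_parity.csv",
    "no_parity": "data/fertility_rates_no_parity.csv",
}


def select_rate_table(config: dict) -> str:
    """Return the rate table path for the fertility option chosen in the config."""
    choice = config.get("fertility_rate_table", "parity")  # default keeps parity
    if choice not in RATE_TABLES:
        raise ValueError(
            f"Unknown fertility_rate_table {choice!r}; "
            f"expected one of {sorted(RATE_TABLES)}"
        )
    return RATE_TABLES[choice]


# Two otherwise identical runs that differ only in the rate table used.
with_parity = select_rate_table({"fertility_rate_table": "parity"})
without_parity = select_rate_table({"fertility_rate_table": "no_parity"})
```

Keeping the choice in one config key means a with/without parity comparison only needs two config files, with no code changes between runs.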
Another side point may be to check the caching works, as it's been slow for me lately (may just be that parity adds a lot of rows, though).
Another possible speed improvement is to drop the interpolation and stick with integer age rates, since we have a yearly cross-section model anyway.
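To illustrate the integer-age idea above, here is a rough sketch comparing an interpolated lookup with a plain floor lookup. The rate values and ages are invented, and the table layout (a Series indexed by integer age) is an assumption rather than the project's actual format.

```python
import numpy as np
import pandas as pd

# Toy annual fertility rates by integer age; values are invented for illustration.
rates = pd.Series({20: 0.05, 21: 0.06, 22: 0.07}, name="rate")

# Exact ages of simulated individuals.
ages = np.array([20.0, 20.5, 21.9])

# Interpolated lookup: linear interpolation between neighbouring integer ages.
interpolated = np.interp(ages, rates.index.to_numpy(), rates.to_numpy())

# Integer-age lookup: floor the age and read the rate directly, no interpolation.
integer_age = rates.reindex(np.floor(ages).astype(int)).to_numpy()
```

With a yearly time step the floor lookup is a single index operation per person, at the cost of ignoring within-year variation in the rate.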
@RobertClay all good, thanks, and will discuss more when I've made some progress. Also should roll in some points from #275, meant to get most of that done by now.
Required for fertility model write-up, i.e. to see effect of including parity in model. Want to have option of removing parity completely so we can compare runs with and without parity. Coding jobs from my email, 25/08/23:
- `convert_rate_data`. -> done in c08fc97 but could be simplified even further
- Think about adding try-except block to `BaseHandler.cache` for case of `rate_table_path` not being defined. -> marking as done as unnecessary
- `transform_rate_table` to be more Pandas-based, as quicker. -> done in f06d495
- ~`FertilityRateTable.add_parity`, as could be less Python and more Pandas; currently takes a minute or two, should be faster. (Mentioned by Rob in comments below.)~ crossing this off as not a priority and rate table production should only ever be done once, then cached
- ~(`nkids_ind`) with specific comparison of with/without parity.~ wiping as vague, purpose not clear

Also would ideally like to get resolved #275 and #291 along the way. -> both done.

Update June 2024.
- Transition from ONS data (1934-2020, England and Wales only) to Human Fertility Database (all UK, various year ranges depending on specific data but generally up to 2020; see here). -> ignoring for now as would have negligible effect on results

Update 09/10/24.
Consolidating outstanding jobs here, and crossing off some non-priority jobs.
- Use `nnewborn` for new births (currently `preg`/`nkids_ind_new`) as (a) imputed and very low missingness, (b) more flexible as includes n > 1 new children, and (c) already used within metrics elsewhere, to be pulled into this branch.
- Replace `has_newborn` with `nnewborn` everywhere by way of harmonisation (i.e. same variable in DG and at runtime).
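A small sketch of why a count is the more flexible variable, as noted in the "more flexible" point above. The column name `nnewborn` follows the issue; the `person_id` column and all values are invented for illustration.

```python
import pandas as pd

# Toy population; `person_id` and the values are hypothetical.
pop = pd.DataFrame({"person_id": [1, 2, 3], "nnewborn": [0, 1, 2]})

# A boolean flag can always be derived from the count when needed...
has_newborn = pop["nnewborn"] > 0

# ...but only the count gives the total number of births, including n > 1.
total_births = int(pop["nnewborn"].sum())
```

Since the flag is recoverable from the count but not the other way round, carrying only `nnewborn` loses no information.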