Closed paddy-r closed 1 year ago
Will come up with some answers for Thursday.
Thanks, I'll try and get a load of the jobs done in the meantime.
(1) Clarify difference b/t key_columns and parameter_columns in add_new_birth_cohorts.setup
Another very undocumented part of vivarium. It's an interpolated lookup table. Make sure you understand lookup tables and linear interpolation before you read this. key_columns are the lookup variables. E.g. for key_columns = [region, sex, ethnicity] it will find the rows in the lookup table with those values, like [East Midlands, F, BAN]. There can be more than one row here.
parameter_columns = [age, time] is more complicated and uses linearly interpolated lookup (order 0, I think?). An observation in the population can have a continuous age and year timestamp, e.g. [age, year] = [51.1245, 2012.12412]. The problem is how to estimate the fertility rate given that the lookup table only has discrete values at ages 51/52 and years 2012/2013. In the lookup table age_specific_fertility_rate we provide vivarium with 4 columns: age_start, age_end, year_start, year_end. Specifying parameter_columns age and time tells vivarium that observations of these values will be continuous, and which columns to use as the start and end points of the interpolation. This is probably better demonstrated with a diagram. Happy to discuss more.
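To make the key_columns / parameter_columns distinction concrete, here is a minimal pure-Python sketch. It is not vivarium's implementation: it assumes order-0 behaviour (following the "order 0, I think?" remark), i.e. an observation simply takes the rate of the bin it falls into. RATE_TABLE, lookup_rate and all rate values are invented for illustration, and "year" stands in for the "time" parameter to keep the column names simple.

```python
# Toy rate table: one row per (region, sex) key and (age, year) bin.
# All names and values here are hypothetical, for illustration only.
RATE_TABLE = [
    {"region": "East Midlands", "sex": "F", "age_start": 51, "age_end": 52,
     "year_start": 2012, "year_end": 2013, "rate": 0.010},
    {"region": "East Midlands", "sex": "F", "age_start": 52, "age_end": 53,
     "year_start": 2012, "year_end": 2013, "rate": 0.008},
    {"region": "East Midlands", "sex": "F", "age_start": 51, "age_end": 52,
     "year_start": 2013, "year_end": 2014, "rate": 0.011},
]

KEY_COLUMNS = ["region", "sex"]       # categorical: must match exactly
PARAMETER_COLUMNS = ["age", "year"]   # continuous: matched against bins


def lookup_rate(person, table=RATE_TABLE):
    """Return the rate for one person, or None if no row matches."""
    for row in table:
        # key_columns select the candidate rows (there can be several).
        if any(row[k] != person[k] for k in KEY_COLUMNS):
            continue
        # parameter_columns pick the row whose [start, end) bins contain
        # the person's continuous values.
        if all(row[f"{p}_start"] <= person[p] < row[f"{p}_end"]
               for p in PARAMETER_COLUMNS):
            return row["rate"]
    return None


person = {"region": "East Midlands", "sex": "F",
          "age": 51.1245, "year": 2012.12412}
rate = lookup_rate(person)  # falls in the age 51-52, year 2012-2013 bin
```

The key columns narrow the table to several candidate rows, and the continuous parameters then select exactly one of them via the *_start / *_end columns.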
(2) Which is current, FertilityAgeSpecificRates or nkidsFertilityAgeSpecificRates? (Presumably the latter.)
The latter.
(3) Currently data_generation.convert_rate_table, which generates rate table file, not called anywhere. How about calling during installation/setup to ensure regional output files present? Or during fertility module initialisation?
It should be called somewhere, yes. Are you sure it's not in the fertility pre_setup function .set_rate_table()? I believe they're cached as they can be quite expensive to generate, particularly if you're adding more data in.
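The "generate only if the cached file is missing" idea could look roughly like the sketch below. This is an assumption about the general pattern, not the code in data_generation or set_rate_table: load_or_build_rate_table and its arguments are hypothetical names, and JSON is used only to keep the example dependency-free.

```python
import json
from pathlib import Path


def load_or_build_rate_table(cache_path, build):
    """Return the cached table if present; otherwise build and cache it.

    cache_path: where the generated table is stored (hypothetical).
    build: zero-argument callable doing the expensive generation.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cheap path: reuse the previously generated file.
        with cache_path.open() as f:
            return json.load(f)
    table = build()  # expensive path: only runs when no cache exists
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w") as f:
        json.dump(table, f)
    return table
```

Calling this during module initialisation would mean the regional output files are always present when needed, without regenerating them on every run. The cached file has to be deleted by hand if the input data changes.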
(4) Should mean be weighted (by population? NewEthPop data exist) in collapse_LAD_to_region (also, collapse_location)?
Not sure. I did this very roughly and I'm not sure whether suitable weights are available. One to discuss on video, I think.
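For the video discussion, here is a small sketch of what population weighting would change. It is not the existing collapse_LAD_to_region code: collapse_to_region, the row layout and the numbers are all hypothetical, and it assumes an LAD-level population count is available to use as the weight.

```python
from collections import defaultdict


def collapse_to_region(rows, weighted=True):
    """Collapse LAD-level rates to one mean rate per region.

    rows: iterable of dicts with 'region', 'rate' and 'population' keys
    (a hypothetical layout). With weighted=False every LAD counts equally.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for row in rows:
        weight = row["population"] if weighted else 1.0
        weighted_sum[row["region"]] += weight * row["rate"]
        total_weight[row["region"]] += weight
    return {region: weighted_sum[region] / total_weight[region]
            for region in weighted_sum}


# Two invented LADs in the same region, with very different sizes.
lads = [
    {"lad": "A", "region": "North", "rate": 0.02, "population": 1000},
    {"lad": "B", "region": "North", "rate": 0.04, "population": 3000},
]
plain = collapse_to_region(lads, weighted=False)
by_population = collapse_to_region(lads, weighted=True)
```

In this toy case the weighted mean is pulled towards the larger LAD's rate, so the two approaches differ most where LAD sizes within a region are very uneven.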
(5) Are LA and region definitions current? If not, can use code from Inclusive Economy to generate lookups
I believe they're 2019? I had to manually adjust some areas (Northamptonshire/Gloucestershire?) that changed their boundaries recently. Your IE code will be better.
(6) Why LAs used in BaseHandler.compute_migration_rates (presumably for migration modules?) but regions used in fertility module?
We don't use migration in MINOS. It's from the old model Daedalus, which does use LA-level data. I'd say ignore it for now, but Nik would probably love you if you did some maintenance on Daedalus too.
(7) Think about how to generalise output/logging functionality in RunPipeline (and already a comment about it there) as very useful for me during fertility module development (cf. https://github.com/Leeds-MRG/Minos/issues/167) -> already partly addressed in job below -> create new issue if good idea to add more detailed functionality
Luke's done a lot of logging. I'd suggest talking to him, but the Python logging module is usually pretty clear and easy to add to. The more the merrier.
(8) How/where to generate/view specific variables during simulation, in the first instance fertility rate and year for which data is sought and year for which data are available?
PyCharm debug flags may be useful here? Or some kind of verbose mode.
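A "verbose mode" built on the standard logging module might look like the sketch below. The logger name "minos.fertility" and the helper functions are hypothetical, not existing MINOS code. The logging calls themselves (getLogger, basicConfig, debug) are standard library.

```python
import logging

# Hypothetical logger name; any dotted name works.
logger = logging.getLogger("minos.fertility")


def configure_logging(verbose=False):
    """Show DEBUG messages only when verbose mode is requested."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(name)s %(levelname)s %(message)s",
    )


def log_rate_lookup(requested_year, available_year, rate):
    """Record which year was asked for and which year of data was used."""
    logger.debug(
        "fertility rate lookup: requested_year=%s available_year=%s rate=%.4f",
        requested_year, available_year, rate,
    )


configure_logging(verbose=True)
log_rate_lookup(2025, 2012, 0.0123)
```

With verbose=False the same call stays in the code but prints nothing, so the diagnostics can be left in place after development.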
(9) How to visualise effects on SF-12? Will only be tiny numerical differences for now (as only changing range of NewEthPop data used here), but would be good to understand how to do it for later in fertility development process. E.g. need new make target somewhere (outcomes/Makefile)
Are the current lineplots we have sufficient? This is a larger problem we're having at the moment: how to visualise the CSV outputs. Discuss.
(99) Once everything here done, discuss duplicating functionality to mortality module, as very similar (e.g. rate table generation, as format of NewEthPop fertility and mortality input data is almost identical)
100% do this next. They're very similar with slight differences (e.g. men can die but not give birth).
Interpolated lookup diagram.
Another question, very trivial...
(10) Which have higher priority, interventions or mortality/fertility modules, based on text in default.yaml and RunPipeline?
From discussion, 20/04/23, priority is:
Added to list of jobs.
Closed with #259.
Issues to discuss/clarify with Rob and Luke during development, to be converted to jobs if agreed:
(1) Clarify difference b/t key_columns and parameter_columns in add_new_birth_cohorts.setup -> DONE (see Rob's comment below)
(2) Which is current, FertilityAgeSpecificRates or nkidsFertilityAgeSpecificRates? (Presumably the latter.) -> DONE (the latter)
(3) Currently data_generation.convert_rate_data, which generates rate table file, not called anywhere. How about calling during installation/setup to ensure regional output files present? Or during fertility module initialisation? -> DONE as moved to job below
(4) Should mean be weighted (by population? NewEthPop data exist) in collapse_LAD_to_region (also, collapse_location)? -> DONE, as converted to job below
(5) Are LA and region definitions current? If not, can use code from Inclusive Economy to generate lookups -> DONE as moved to new issue (#219), could be useful but not a priority for now as currently aggregated into region anyway
(6) Why LAs used in BaseHandler.compute_migration_rates (presumably for migration modules?) but regions used in fertility module? -> DONE, as answered by Rob below
(7) Think about how to generalise output/logging functionality in RunPipeline (and already a comment about it there) as very useful for me during fertility module development (cf. #167) -> already partly addressed in job below -> create new issue if good idea to add more detailed functionality -> marking as DONE as vague and not priority; also at least partly covered by job below
(8) How/where to generate/view specific variables during simulation, in the first instance fertility rate and year for which data is sought and year for which data are available? -> marking as DONE as (1) vague, (2) will become clearer over time and (3) at least partly covered by jobs below
(9) How to visualise effects on SF-12? Will only be tiny numerical differences for now (as only changing range of NewEthPop data used here), but would be good to understand how to do it for later in fertility development process. E.g. need new make target somewhere (outcomes/Makefile) -> marking as DONE because vague and covered by jobs below
(99) Once everything here done, discuss duplicating functionality to mortality module, as very similar (e.g. rate table generation, as format of NewEthPop fertility and mortality input data is almost identical) -> DONE as podded off into another issue (#213)
Rough to-do list:
data_generation.convert_rate_data, as similar functionality already there
FutureWarning in data_generation.convert_rate_data; also podded off into #212 as called elsewhere (i.e. outside fertility module) as well
FertilityRateTable.__init__ when called from add_new_birth_cohorts -> changed file to that containing all years, format is identical except year column added; only grabbing 2011-2012 though
FertilityRateTable._build, currently hard-coded to pass year_start = 2011, year_end = 2012 to transform_rate_table
get_nearest_year functionality if nothing present, but where? In utils? Purpose is to get nearest year of data for a particular simulation year, in case that particular year isn't present in rate table. Put to new issue later if useful for other modules/general use
data_generation.convert_rate_data called during fertility/mortality module initialisation if necessary (i.e. if cached file not present); also added to #213
collapse_LAD_to_region (also, collapse_location), e.g. with NewEthPop population data -> marking as done here as moved to issue #218
asfr ultimately has all years of NewEthPop fertility data in memory -> just checked with print statement, not necessary to do anything more than that; also done for mortality, see #213
add_new_birth_cohorts.nkidsFertilityAgeSpecificRates.setup
add year to requires_columns (argument to register_rate_producer), but see (1) above -> marking as done as not necessary for year but is for parity? Added to #167 for now
view_columns (argument to get_view)? Also see (1) above -> don't need to add year, but do need to add parity? -> marking as done as not necessary for year but is for parity? Added to #167 for now
REGION.name and ETH.GROUP in BaseHandler; exact process not clear ATM, so add detail/new issue later; cf. (7) above -> marking as done as grouped into #220
(and parity, which should just be a dummy for now; to be addressed in another issue, cf. #167) to N-nested for loops in BaseHandler to account for all-year rate table
itertools.combinations (v. easy) -> moving to new issue (#217) but probably not necessary and not a priority
BaseHandler.cache as rate_table_path not defined by default -> marking as done as moved to #221
RunPipeline, move components map(s) outside of method/class in case useful elsewhere (to sort components by priority in priority_sort)
RunPipeline, re. Rob's warning in config files -> validate_and_sort_components, called from RunPipeline
RunPipeline.RunPipeline for nkidsFertilityAgeSpecificRates and fertility by year (partly addresses one point in #167) -> marking as done as grouped into #220
fertility_default.yaml
Configure config file to take either (a) new, compiled all-years fertility file, or (b) folder (of NewEthPop fertility data) rather than single file (depending on how development goes/discussion) -> marking as done as duplicate of another job above, and specifying folder rather than single file is unnecessary
priority_sort functionality done, and add some alternative notes there -> only in fertility_default.yaml for now
scripts/Makefile, for comparison with old fertility baseline -> currently fertility_testing in scripts/Makefile
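The get_nearest_year item in the to-do list describes picking the closest year of data when the simulation year is missing from the rate table. A minimal sketch of that idea follows. The signature and the tie-breaking rule (prefer the earlier year) are assumptions, since the thread does not say where the function lives or how it behaves.

```python
def get_nearest_year(target_year, available_years):
    """Return the available year closest to target_year.

    Ties go to the earlier year (an arbitrary choice for this sketch).
    """
    years = sorted(available_years)
    if not years:
        raise ValueError("no years available in rate table")
    # min() returns the first minimal element, so sorting first makes
    # the earlier year win when two years are equally close.
    return min(years, key=lambda year: abs(year - target_year))


# Example: the rate table only holds 2011 and 2012.
nearest = get_nearest_year(2025, [2011, 2012])
```

Because it only needs a list of years, a helper like this would work unchanged for the mortality module, which is one argument for putting it somewhere shared such as utils.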