Open capnrefsmmat opened 3 years ago
Hey @alexcoda want to take a look at this? Cheryl has a good start in #1213 but it's missing the critical plumbing to actually do the geographic aggregations (it currently just outputs extra copies of the state df under different names).
I've attached a csv file, the result of recently running pull.pull_nchs_mortality_data with our Socrata key, for you to use for testing.
@krivard yep! I'll take a crack at it sometime this weekend. I'll let you know if I have any more questions about it
Looking at the csv columns, there's a mix of counts and percent values (from the source), so presumably we'll have to do something like the following to do a weighted average of the percentages and an unweighted sum for the counts? And NCHS should cover all states that are included in HHS regions, so we don't need to worry about weird denominator handling, right?
df["weight"] = df["population"]
proportion_vals = gmpr.replace_geocode(df, "state_id", new_geo, ..., date_col="timestamp", data_cols=[<all columns that are a percentage>])
# weight column gets removed
count_vals = gmpr.replace_geocode(df, "state_id", new_geo, ..., date_col="timestamp", data_cols=[<all columns that are counts>])
# combine the two dataframes back
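For reference, the arithmetic behind the plan above can be sketched in plain pandas, independent of geomapper. This is only an illustration under stated assumptions: the column names (`deaths`, `pct_expected`), the helper `aggregate_to_hhs`, the partial `STATE_TO_HHS` crosswalk, and the toy numbers are all hypothetical, and the real indicator would presumably go through `replace_geocode` instead.

```python
import pandas as pd

# Hypothetical, partial state -> HHS region crosswalk (illustration only).
STATE_TO_HHS = {"ct": "1", "ma": "1", "nj": "2", "ny": "2"}


def aggregate_to_hhs(df, count_cols, pct_cols):
    """Sum count columns and population-weight percentage columns per HHS region."""
    df = df.assign(hhs=df["state_id"].map(STATE_TO_HHS))
    keys = ["hhs", "timestamp"]

    # Counts: unweighted sum across the states in each region.
    counts = df.groupby(keys)[count_cols].sum()

    # Percentages: sum(pct * population) / sum(population).
    # Caveat: a missing percentage is skipped in the numerator but its
    # population still counts in the denominator; real data may need
    # explicit handling for that case.
    weighted = df[keys + pct_cols].copy()
    weighted[pct_cols] = weighted[pct_cols].mul(df["population"], axis=0)
    numerator = weighted.groupby(keys)[pct_cols].sum()
    denominator = df.groupby(keys)["population"].sum()
    pcts = numerator.div(denominator, axis=0)

    # Combine the two dataframes back.
    return counts.join(pcts).reset_index()


# Made-up toy data, not real NCHS values.
toy = pd.DataFrame({
    "state_id": ["ct", "ma", "nj", "ny"],
    "timestamp": ["2021-01-02"] * 4,
    "population": [100, 300, 200, 200],
    "deaths": [10, 30, 5, 15],
    "pct_expected": [2.0, 4.0, 1.0, 3.0],
})
out = aggregate_to_hhs(toy, count_cols=["deaths"], pct_cols=["pct_expected"])
print(out)
```

With the toy data, region 1 gets a deaths sum of 40 and a weighted percentage of (2.0 * 100 + 4.0 * 300) / 400 = 3.5, which differs from the unweighted mean of 3.0; that gap is the reason for the weight column.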
We ended up going down a huge rabbit hole after realizing geomapper doesn't do state_id to fips, only state_code, but will continue to finish this next week.
> presumably we'll have to do something like the following to do
that looks right, yes.
> NCHS should cover all states that are included in HHS regions
That's correct; here's the NCHS coverage map for reference
(and the like five different ways of specifying states is indeed a pain)
The NCHS mortality data is currently only available at the state level. It seems like it should be possible to aggregate it to the nation and HHS levels. (If it's not possible for some reason, we should document that so nobody tries to take the state data and aggregate it themselves.) Having all our signals consistently available at HHS and nation when possible would make it easy to compare things.
There's no pressing need for this that I know of; I just noticed the inconsistency and think it'd be nice to fix.