Health Equity Tracker is a free-to-use data visualization platform that is enabling new insights into the impact of COVID-19 and other social and political determinants of health on historically underrepresented groups in the United States.
Description and Motivation
BACKGROUND
These future warnings occurred because .replace() used to change a column's dtype automatically when needed, for instance changing int to float when setting a value to None, which an int column can't hold. In Pandas 3 it will no longer do this automatically. After a lot of research, the best solution turned out to be adopting the future behavior now and fixing any resulting errors. Once we upgrade to Pandas 3, we should be able to remove the added with lines that opt in to the future behavior.
updates to expect np.nan to represent all missing data in our expected_ dataframes
fixes #3279
fixes #2907
scopes the .replace() call in HIV so it only does string replacement in the relevant age column. This prevents the warning because .replace() no longer runs against the float and int cols, only the known str col for age group
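A rough sketch of the scoping idea, not the actual HIV code: the column names, age labels, and mapping below are hypothetical. The point is only that .replace() is called on the one string column rather than on the whole dataframe.

```python
import pandas as pd

# Hypothetical source column name and label mapping, for illustration only.
SOURCE_AGE_COL = "Age Group"
AGE_LABEL_MAP = {"Ages 13-24 years": "13-24", "Ages 25-34 years": "25-34"}

df = pd.DataFrame(
    {
        SOURCE_AGE_COL: ["Ages 13-24 years", "Ages 25-34 years"],
        "Cases": [10, 20],
        "Rate per 100000": [1.5, 2.5],
    }
)

# Before (sketch): df = df.replace(AGE_LABEL_MAP) would run against every column.
# After: only the known string column is touched; numeric columns are left alone.
df[SOURCE_AGE_COL] = df[SOURCE_AGE_COL].replace(AGE_LABEL_MAP)

print(df)
```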
HIV REFACTOR
refactors to define source constants at the top of the file and use them throughout. I'd like to adopt this pattern so it's easy to tell at a glance which strings come from the source data (vs. our outputted tables). Each source col is also labeled with the same term as the col it will transform into, e.g. CDC_STATE_FIPS, so we know that's the source equivalent of std_col.STATE_FIPS_COL
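To make the pattern concrete, here is a small sketch. Only the names CDC_STATE_FIPS and std_col.STATE_FIPS_COL come from this PR; the string values, the CDC_AGE constant, the helper function, and the stand-in for our std_col module are assumptions for illustration.

```python
from types import SimpleNamespace

import pandas as pd

# Stand-in for our std_col module so the sketch is self-contained (values assumed).
std_col = SimpleNamespace(STATE_FIPS_COL="state_fips", AGE_COL="age")

# Source constants at the top of the file: strings exactly as they appear in the
# source data, named after the standard column each one becomes.
CDC_STATE_FIPS = "FIPS"  # hypothetical source header
CDC_AGE = "Age Group"  # hypothetical source header

SOURCE_TO_STD_COLS = {
    CDC_STATE_FIPS: std_col.STATE_FIPS_COL,
    CDC_AGE: std_col.AGE_COL,
}


def rename_source_cols(df: pd.DataFrame) -> pd.DataFrame:
    """Rename source columns to their standard equivalents (hypothetical helper)."""
    return df.rename(columns=SOURCE_TO_STD_COLS)


source_df = pd.DataFrame({CDC_STATE_FIPS: ["01"], CDC_AGE: ["13-24"]})
out = rename_source_cols(source_df)
print(list(out.columns))
```

Keeping every source string behind a CDC_ constant means a change in the upstream headers only needs to be fixed in one place.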
Has this been tested? How?
expected data updated to have np.nan
use assert_frame_equal as better test for util fn
tests passing
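A sketch of why assert_frame_equal pairs well with np.nan in the expected data: it treats NaN values in the same positions as equal, whereas a plain == comparison does not. The util function and columns below are hypothetical stand-ins, not the ones touched in this PR.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal


def null_out_sentinel(df: pd.DataFrame, col: str, sentinel: float) -> pd.DataFrame:
    """Hypothetical util: replace a sentinel value in one column with np.nan."""
    out = df.copy()
    out[col] = out[col].replace(sentinel, np.nan)
    return out


source_df = pd.DataFrame({"state_fips": ["01", "02"], "rate": [1.5, -1.0]})
actual = null_out_sentinel(source_df, "rate", -1.0)

# Expected data uses np.nan for missing values, matching the convention above.
expected = pd.DataFrame({"state_fips": ["01", "02"], "rate": [1.5, np.nan]})

# Passes: NaNs in matching positions compare as equal, and dtypes are checked too.
assert_frame_equal(actual, expected)
```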
Screenshots (if appropriate)
Types of changes
(leave all that apply)
Bug fix
Refactor / chore
New frontend preview link is below in the Netlify comment 😎