usds / justice40-tool

A tool to identify disadvantaged communities due to environmental, socioeconomic and health burdens
https://screeningtool.geoplatform.gov/
Creative Commons Zero v1.0 Universal

As a data developer, I want to easily update the data pickles #1048

Open lucasmbrown-usds opened 2 years ago

lucasmbrown-usds commented 2 years ago

Description: If we add a simple global constant such as UPDATE_PICKLES = False to the data/data-pipeline/data_pipeline/etl/score/tests/test_score_post.py file, then whenever the pickles need updating, you only have to set that constant to True.

And within each test, it would be something like:

if UPDATE_PICKLES:
    # Overwrite the stored snapshot with the current output.
    data_path = Path.cwd()
    score_transformed_actual.to_pickle(
        data_path / "data_pipeline" / "etl" / "score" / "tests" / "snapshots" / "score_transformed_expected.pkl",
        protocol=4,
    )
    # Fail deliberately so the flag cannot be left on unnoticed.
    pytest.fail("Test fails because `UPDATE_PICKLES` is True.")
else:
    pdt.assert_frame_equal(
        score_transformed_actual, score_transformed_expected, check_dtype=False
    )

We would never commit UPDATE_PICKLES = True to the actual code in the app; the flag would only be set temporarily, on a developer's machine, to regenerate the pickles.
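To avoid repeating that if/else in every test, the same pattern could be wrapped in one helper. The following is only a sketch under stated assumptions: `check_against_snapshot` and its `update` parameter are hypothetical names, not part of the existing test suite, and it raises a plain `AssertionError` where the snippet above calls `pytest.fail`, so that it depends on pandas alone.

```python
from pathlib import Path

import pandas as pd
import pandas.testing as pdt

# Hypothetical module-level switch, mirroring the constant proposed in this issue.
UPDATE_PICKLES = False


def check_against_snapshot(actual: pd.DataFrame, snapshot_path: Path, update: bool) -> None:
    """Compare `actual` with a pickled snapshot, or rewrite the snapshot when `update` is True.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    if update:
        # Make sure the snapshot directory exists, then overwrite the pickle.
        snapshot_path.parent.mkdir(parents=True, exist_ok=True)
        actual.to_pickle(snapshot_path, protocol=4)
        # Fail on purpose so a run with the flag left on can never pass.
        raise AssertionError(
            f"Snapshot {snapshot_path.name} was rewritten because update is True."
        )
    # Normal mode: load the stored snapshot and compare it with the actual frame.
    expected = pd.read_pickle(snapshot_path)
    pdt.assert_frame_equal(actual, expected, check_dtype=False)
```

Inside a test, the call would then look like `check_against_snapshot(score_transformed_actual, snapshot_path, update=UPDATE_PICKLES)`, where `snapshot_path` points at the relevant file in the snapshots directory. Passing the flag explicitly, rather than reading the global inside the helper, keeps the helper easy to exercise on its own.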

Instead, copy or expand upon the framework developed in https://github.com/usds/justice40-tool/pull/1249.

lucasmbrown-usds commented 2 years ago

Potentially copy or expand upon the framework developed in https://github.com/usds/justice40-tool/pull/1249.