aesharpe closed 7 months ago
Looping in @robertozanchi
The last fix here is related to #2448: updated the file map to say Final instead of Early Release and actually extracted this raw table (it had been blocked due to issues with the 2018 archive). See #3100.
Annual Updates Docs: https://catalystcoop-pudl.readthedocs.io/en/dev/dev/annual_updates.html
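
The row-count step in the checklist below asks that no table end up with fewer rows than before and that no change be unexpectedly large. A minimal sketch of that comparison, assuming the expected and found counts are available as plain dicts: the function name `review_row_counts`, the table names, the numbers, and the 25% threshold are all hypothetical and are not part of PUDL.

```python
def review_row_counts(expected, found, max_rel_change=0.25):
    """Return human-readable problems found when comparing row counts.

    expected / found: dicts mapping a table name to its row count.
    max_rel_change: hypothetical threshold for "unexpectedly large" growth.
    """
    problems = []
    for table, n_expected in expected.items():
        n_found = found.get(table)
        if n_found is None:
            problems.append(f"{table}: no row count found")
        elif n_found < n_expected:
            # A table should not lose rows when a new year of data is added.
            problems.append(
                f"{table}: rows decreased from {n_expected} to {n_found}"
            )
        elif n_expected > 0 and (n_found - n_expected) / n_expected > max_rel_change:
            problems.append(
                f"{table}: rows grew from {n_expected} to {n_found}, "
                f"more than {max_rel_change:.0%}"
            )
    return problems


# Hypothetical numbers, for illustration only.
expected = {"table_a": 100, "table_b": 200, "table_c": 50}
found = {"table_a": 110, "table_b": 190, "table_c": 100}
problems = review_row_counts(expected, found)
for problem in problems:
    print(problem)
```

Only counts that pass a review like this should be copied into the test as the new expected values; anything flagged deserves a closer look first.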
- Update the DOI for eia923 in `pudl/workspace/datastore.py`.
- Run `pudl_datastore --dataset eia923`. The new raw data will appear in `pudl_input/eia923/<ZENODO_DOI>/...`
- Update the file maps in `pudl/package_data/eia923`.
- If necessary, launch Dagster (run `dagster-webserver -m pudl.etl` and then open `http://127.0.0.1:3000/locations/pudl.etl/jobs/etl_full` in a browser).
- Materialize the `raw_eia923` asset group. Look out for warnings in the logs about missing or extra columns. If they appear, check and update the `package_data` accordingly.
- Materialize the `_core_eia923` asset group. Look out for warnings and fix accordingly.
- Materialize the `norm_eia` and then the `denorm_eia` asset groups. You'll probably see some errors related to encoding. Take a look at which column the error is talking about and look at `metadata/resources/eia.py` to see which encoder in `CODE_METADATA` to tweak.
- Update `test_minmax_rows` in `test/validate/eia_test.py`. Sometimes it helps to just run the test (`pytest test/validate/eia_test.py::test_minmax_rows`) in the terminal, because it prints how many rows it found vs. how many it expected, and you can put the found rows into the code so they become the expected rows. Make sure none of the tables has fewer rows than before. Also make sure none of the row changes are unexpectedly large.
- Run `tox` and troubleshoot what else might be broken! Might include things like: