Purpose
Compare the mapping of (plant_id_eia, generator_id) to subplant_id across years. Closes CAR-3518.
What the code is doing
Load the sub-plant crosswalk files for years 2019 through 2022. For each pair of years, take the intersection of (plant_id_eia, generator_id) keys present in both files, and save the rows where the subplant_id values differ as a pandas DataFrame.
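For reviewers who want the gist without opening the notebook, here is a minimal sketch of the comparison described above. It is not the notebook's actual code: the function name `compare_subplant_ids`, the `crosswalks` dict keyed by year, and the toy data are all illustrative assumptions, and it assumes each crosswalk has one row per (plant_id_eia, generator_id).

```python
from itertools import combinations

import pandas as pd

KEYS = ["plant_id_eia", "generator_id"]


def compare_subplant_ids(
    crosswalks: dict[int, pd.DataFrame],
) -> dict[tuple[int, int], pd.DataFrame]:
    """Hypothetical helper: per year pair, shared generators whose subplant_id differs."""
    differences = {}
    for year_a, year_b in combinations(sorted(crosswalks), 2):
        # The inner merge keeps only keys present in both years (the intersection).
        merged = crosswalks[year_a][KEYS + ["subplant_id"]].merge(
            crosswalks[year_b][KEYS + ["subplant_id"]],
            on=KEYS,
            how="inner",
            suffixes=(f"_{year_a}", f"_{year_b}"),
        )
        # Caveat: NaN != NaN is True, so missing subplant_ids would count as differences.
        mismatch = merged[f"subplant_id_{year_a}"] != merged[f"subplant_id_{year_b}"]
        differences[(year_a, year_b)] = merged[mismatch].reset_index(drop=True)
    return differences


# Made-up crosswalks standing in for the real files (assumption, not real data).
toy = {
    2019: pd.DataFrame(
        {"plant_id_eia": [1, 1, 2], "generator_id": ["A", "B", "A"], "subplant_id": [1, 1, 1]}
    ),
    2020: pd.DataFrame(
        {"plant_id_eia": [1, 1, 3], "generator_id": ["A", "B", "A"], "subplant_id": [1, 2, 1]}
    ),
}
result = compare_subplant_ids(toy)
n_diff = {pair: len(df) for pair, df in result.items()}
print(n_diff)  # {(2019, 2020): 1}
```

In the toy data only generator (1, "B") appears in both years with a different subplant_id, so the count for the 2019-2020 pair is 1. Generators that exist in only one year are dropped by the inner merge and are not counted.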
Testing
N/A
Where to look
A new notebook named validate_subplant_crosswalk contains this short analysis.
Number of differences in subplant_id for the same (plant_id_eia, generator_id) combination, by year pair:
2019-2020: 448
2019-2021: 895
2019-2022: 1193
2020-2021: 488
2020-2022: 831
2021-2022: 435
Review estimate
10min
Future work
N/A
Checklist
[x] Update the documentation to reflect changes made in this PR
[x] Format all updated python files using black
[x] Clear outputs from all notebooks modified
[x] Add docstrings and type hints to any new functions created