Purpose
Interpret missing values in the source HIFLD substations CSV more accurately, to produce better bus demand estimates. This is a follow-up to #235.
What the code is doing
"NOT AVAILABLE" values in the "ZIP" column are treated as null rather than as a distinct string entry (this matters for the calls to `value_counts` and `groupby` within `assign_demand_to_buses`).
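A minimal sketch of why the null handling matters, assuming the CSV is read with pandas. The tiny inline CSV, the `ID` column, and the use of `na_values` are illustrative assumptions, not necessarily how this PR implements the change:

```python
import io

import pandas as pd

# Hypothetical miniature of the substations CSV (made-up rows).
csv_text = """ID,ZIP
1,77001
2,NOT AVAILABLE
3,77001
4,NOT AVAILABLE
5,NOT AVAILABLE
"""

# Read naively: the placeholder is just another string, so it gets counted
# like a real ZIP code.
raw = pd.read_csv(io.StringIO(csv_text), dtype={"ZIP": str})
raw_counts = raw["ZIP"].value_counts()

# Read with the placeholder declared as a null marker for the ZIP column.
clean = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"ZIP": str},
    na_values={"ZIP": ["NOT AVAILABLE"]},
)

# value_counts and groupby both skip nulls by default, so the placeholder
# no longer shows up as its own "ZIP code".
clean_counts = clean["ZIP"].value_counts()
group_sizes = clean.groupby("ZIP").size()

print(raw_counts)
print(clean_counts)
```

In the naive read, the placeholder is the most common "ZIP" in this toy data; after the null conversion it drops out of both the counts and the groups.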
Non-USA substations are filtered out (there are several in Canada), which allows the "COUNTYFIPS" column to be interpreted as int and meaningfully compared against the county population data in `assign_demand_to_buses`. Previously, because the Canadian substations have "NOT AVAILABLE" in this column, the whole column was interpreted as strings rather than ints, so comparing its entries against the county population data file (whose entries are ints) was meaningless.
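The dtype problem can be sketched as follows, again assuming pandas. The `COUNTRY` column name, the inline rows, and the population numbers are made up for illustration; the actual filter in this PR may be written differently:

```python
import io

import pandas as pd

# Hypothetical miniature of the substations CSV (made-up rows; the COUNTRY
# column name is an assumption for this sketch).
csv_text = """ID,COUNTRY,COUNTYFIPS
1,USA,48201
2,USA,6037
3,CAN,NOT AVAILABLE
"""

# Hypothetical county population lookup keyed by integer FIPS code
# (placeholder values, not real populations).
county_population = pd.Series({48201: 100, 6037: 200})

substations = pd.read_csv(io.StringIO(csv_text))

# One non-numeric placeholder forces the whole column to be read as strings,
# so "48201" never equals the integer key 48201.
raw_first_matches = substations["COUNTYFIPS"].iloc[0] == 48201

# Dropping non-USA rows removes the placeholder, so the column can be cast
# to int and compared against the integer-keyed population data.
usa = substations[substations["COUNTRY"] == "USA"].copy()
usa["COUNTYFIPS"] = usa["COUNTYFIPS"].astype(int)
usa["population"] = usa["COUNTYFIPS"].map(county_population)

print(raw_first_matches)
print(usa)
```

The string-vs-int comparison fails silently (it just never matches), which is why the mismatch is easy to miss without checking the column dtype.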
Testing
Tested manually for feasibility: ERCOT remains feasible (0% load shedding), and load shedding in WECC fell from 12.7% of August demand to 10.8%.
The scatter plot below shows how the bus `Pd` values changed (left side is absolute scale, right side is log scale):
![image](https://user-images.githubusercontent.com/7348392/168185472-0748ed5c-d4e8-4aa5-9533-ca0dc0623081.png)
Many substations that had been getting a large demand no longer are. There's a good deal of noise among the demand values for the smaller substations (<100 MW), but the larger ones seem pretty consistent with their previous values.
Time estimate
15 minutes.