The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
Both the EIA-860 and EIA-861 report information about the kind of entity each utility is -- how they're owned, what their primary business model is, etc. However, the categories they use aren't quite identical and they have evolved over time, so it's not entirely clear what we should do with this data. Should it be harvested and combined into a single annually varying field that's associated with the Utility ID? Or are they really describing separate (though related) attributes which should be tracked independently?
Note: under our new naming conventions, this table has become core_eia__codes_entity_types.
EIA-861
sales_eia861
Uses full category names in Title Case. Note that some other EIA-861 tables use the 1-2 letter short codes instead!
The set of categories used seems to have changed slightly between 2001-2006, 2007-2011, and 2012-present.
Comparing the entity_type reported by year and utility ID in sales_eia861 with the one in utilities_eia860, there are 16521 unique combinations, of which only 289 have different entity_type values. God it's so close.
In 202/289 cases the entity_type is reported as political_subdivision in the EIA-861, but has more specificity in the EIA-860 -- mostly being reported as state, municipal, or federal.
A little more than half of the remaining values are reported as independent_power_producer in the EIA-860, but have more specificity in the EIA-861, mostly showing up as retail_power_marketer or behind_the_meter.
From 2001-2006 sales_eia861 has an entity type called facility, which seems to switch in 2007-2011 to being called unregulated. These seem like they may correspond to the modern independent_power_producer category, which is described as "Independent power producer or qualifying facility" in the EIA-861 instructions. However, it seems that none of these utilities appear in later years of data, so it's hard to be sure. Many of them appear to be cogeneration facilities.
Almost all of the remaining utility IDs that have more than one entity_type associated with them are due to changes from the generic power_marketer to the more specific retail_power_marketer in 2007, plus a smattering of utilities switching between retail_power_marketer and wholesale_power_marketer.
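For anyone who wants to reproduce this kind of comparison, here is a minimal sketch using pandas. The two dataframes are tiny made-up stand-ins for sales_eia861 and utilities_eia860, and the key column names report_year and utility_id_eia are assumptions for illustration; the real tables may name or type these columns differently.

```python
import pandas as pd

# Toy stand-ins for the real tables. All values are invented.
sales_eia861 = pd.DataFrame(
    {
        "report_year": [2010, 2010, 2010, 2011],
        "utility_id_eia": [1, 2, 3, 1],
        "entity_type": [
            "political_subdivision",
            "retail_power_marketer",
            "municipal",
            "political_subdivision",
        ],
    }
)
utilities_eia860 = pd.DataFrame(
    {
        "report_year": [2010, 2010, 2010, 2011],
        "utility_id_eia": [1, 2, 3, 1],
        "entity_type": [
            "municipal",
            "independent_power_producer",
            "municipal",
            "state",
        ],
    }
)

# Assumed key columns identifying one utility in one year.
keys = ["report_year", "utility_id_eia"]

# Keep one row per (year, utility, entity_type) on each side, then line
# the two sources up on the shared keys.
combined = (
    sales_eia861[keys + ["entity_type"]]
    .drop_duplicates()
    .merge(
        utilities_eia860[keys + ["entity_type"]].drop_duplicates(),
        on=keys,
        how="inner",
        suffixes=("_eia861", "_eia860"),
    )
)

# Rows where the two forms disagree. Note that a missing value on either
# side would also count as a mismatch here; the toy data has none.
mismatched = combined[
    combined["entity_type_eia861"] != combined["entity_type_eia860"]
]

# Tally which pairs of labels account for the disagreements.
mismatch_counts = (
    mismatched.groupby(["entity_type_eia861", "entity_type_eia860"])
    .size()
    .sort_values(ascending=False)
)

print(f"{len(mismatched)} of {len(combined)} combinations differ")
print(mismatch_counts)
```

Grouping the mismatches by the pair of labels is what makes patterns like political_subdivision vs. state / municipal / federal easy to spot.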
Other EIA-861 tables containing entity_type:
advanced_metering_infrastructure_eia861
demand_side_management_misc_eia861
mergers_eia861
operational_data_misc_eia861
reliability_eia861
utility_data_misc_eia861
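Values that appear in those tables:
import json

# pudl_out is an existing PUDL output object; its _dfs attribute is used
# here as a dict mapping table names to already-loaded dataframes.
all_entity_types_eia861 = {
    # Sorted unique non-null entity_type values for each table that has the column.
    x: sorted(pudl_out._dfs[x].entity_type.dropna().unique())
    for x in pudl_out._dfs if "entity_type" in pudl_out._dfs[x].columns
}
print(json.dumps(all_entity_types_eia861, indent=4, sort_keys=True))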
Possible Actions / Questions:
Reassign facility and unregulated to independent_power_producer in the early years.
Use the more specific retail_power_marketer entity type where possible on older records that just say power_marketer.
Would it be reasonable / correct to use the more specific state / federal / municipal categorizations in place of political_subdivision when that's how utilities are identified in one of the datasets?
Is there any reasonable reconciliation to be had in the mismatched independent_power_producer records that show up as behind_the_meter or retail_power_marketer?
Can we find older versions of the form instructions for both EIA-860 and EIA-861 to try and figure out whether these should really be distinct columns?
What does the entity_type column look like in the other EIA-861 tables?
Changes in entity_type from year to year are extremely rare. Are they real, or reporting errors?
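As a rough illustration of two of the items above (recoding the retired labels, and finding utilities whose entity_type changes over time), here is a small pandas sketch. The records are invented, the column names report_year and utility_id_eia are assumptions, and whether facility and unregulated really should map to independent_power_producer is exactly the open question, so treat the mapping as a hypothesis rather than a settled rule.

```python
import pandas as pd

# Hypothetical mapping from retired labels to the modern category.
# This encodes the proposal above; it is not confirmed by EIA documentation.
LEGACY_TO_MODERN = {
    "facility": "independent_power_producer",
    "unregulated": "independent_power_producer",
}

# Invented example records, one row per utility per year.
records = pd.DataFrame(
    {
        "report_year": [2005, 2008, 2006, 2007, 2012],
        "utility_id_eia": [10, 10, 20, 20, 20],
        "entity_type": [
            "facility",
            "unregulated",
            "power_marketer",
            "retail_power_marketer",
            "retail_power_marketer",
        ],
    }
)

# Recode the retired labels; values not in the mapping are left unchanged.
cleaned = records.assign(
    entity_type=records["entity_type"].replace(LEGACY_TO_MODERN)
)

# Count distinct entity types per utility after recoding, and list the
# utilities that still report more than one. These are the candidates to
# inspect by hand as either real changes or reporting errors.
n_types = cleaned.groupby("utility_id_eia")["entity_type"].nunique()
changed = n_types[n_types > 1].index.tolist()

print(cleaned)
print("Utilities with more than one entity_type:", changed)
```

In this toy data, utility 10 collapses to a single category once the old labels are recoded, while utility 20 is still flagged because of the power_marketer to retail_power_marketer switch.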