catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Reconcile `entity_types_eia` / `entity_type` / ownership category discrepancies in eia860 & eia861 #1392

Open zaneselvans opened 2 years ago

zaneselvans commented 2 years ago

Both the EIA-860 and the EIA-861 report information about what kind of entity each utility is: how it is owned, what its primary business model is, etc. However, the categories the two forms use aren't quite identical, and they have evolved over time, so it's not entirely clear what we should do with this data. Should it be harvested and combined into a single annually varying field that's associated with the Utility ID? Or are the forms really describing separate (though related) attributes which should be tracked independently?

Note: under our new naming conventions, the `entity_types_eia` table has become `core_eia__codes_entity_types`.

EIA-861

sales_eia861

Other EIA-861 tables containing entity_type:

Values that appear in those tables

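import json

# For every dataframe in pudl_out._dfs that has an entity_type column,
# collect its distinct non-null entity_type values, sorted.
all_entity_types_eia861 = {
    x: sorted(pudl_out._dfs[x].entity_type.dropna().unique())
    for x in pudl_out._dfs if "entity_type" in pudl_out._dfs[x].columns
}
# Print the table-name -> values mapping as pretty-printed JSON.
print(json.dumps(all_entity_types_eia861, indent=4, sort_keys=True))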
{
    "advanced_metering_infrastructure_eia861": [
        "Behind the Meter",
        "Cooperative",
        "Federal",
        "Investor Owned",
        "Municipal",
        "Political Subdivision",
        "Retail Power Marketer",
        "State"
    ],
    "demand_side_management_misc_eia861": [
        "1",
        "2",
        "3",
        "4",
        "5",
        "6",
        "7",
        "8",
        "Cooperative",
        "DSM Administrator",
        "Federal",
        "Investor Owned",
        "Municipal",
        "Municipal Mktg Authority",
        "Political Subdivision",
        "Power Marketer",
        "Retail Power Marketer",
        "State"
    ],
    "mergers_eia861": [
        "C",
        "I",
        "M",
        "P",
        "R",
        "W"
    ],
    "operational_data_misc_eia861": [
        "Behind the Meter",
        "Community Choice Aggregator",
        "Cooperative",
        "Federal",
        "Investor Owned",
        "Municipal",
        "Municipal Mktg Authority",
        "Other",
        "Political Subdivision",
        "Power Marketer",
        "Private",
        "Retail Power Marketer",
        "State",
        "Transmission",
        "Unknown",
        "Wholesale Power Marketer"
    ],
    "reliability_eia861": [
        "Cooperative",
        "Investor Owned",
        "Municipal",
        "Political Subdivision",
        "State"
    ],
    "sales_eia861": [
        "Behind the Meter",
        "Community Choice Aggregator",
        "Cooperative",
        "Facility",
        "Federal",
        "Investor Owned",
        "Municipal",
        "Political Subdivision",
        "Power Marketer",
        "Retail Power Marketer",
        "State",
        "Unregulated",
        "Wholesale Power Marketer"
    ],
    "utility_data_misc_eia861": [
        "A",
        "Behind the Meter",
        "C",
        "Community Choice Aggregator",
        "Cooperative",
        "F",
        "Federal",
        "I",
        "Investor Owned",
        "M",
        "Municipal",
        "Municipal Mktg Authority",
        "P",
        "Political Subdivision",
        "R",
        "Retail Power Marketer",
        "S",
        "State",
        "T",
        "Transmission",
        "Unknown",
        "W",
        "Wholesale Power Marketer"
    ]
}
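The output above mixes three apparent encodings of `entity_type`: full labels, single-letter codes, and numeric codes. As a rough sketch of how one might quantify that before deciding how to reconcile the tables, the snippet below buckets each value by its apparent encoding and finds the labels shared by every table that uses labels. The helper names `classify` and `summarize` are hypothetical and not part of PUDL, and the dictionary is a hand-copied subset of the output above rather than a live query. No meaning is assigned to any code here, since the mapping from codes to labels is not stated in this issue.

```python
# Hand-copied subset of the output above, for illustration only.
all_entity_types_eia861 = {
    "mergers_eia861": ["C", "I", "M", "P", "R", "W"],
    "reliability_eia861": [
        "Cooperative", "Investor Owned", "Municipal",
        "Political Subdivision", "State",
    ],
    "demand_side_management_misc_eia861": [
        "1", "2", "3", "4", "5", "6", "7", "8",
        "Cooperative", "DSM Administrator", "Federal", "Investor Owned",
        "Municipal", "Municipal Mktg Authority", "Political Subdivision",
        "Power Marketer", "Retail Power Marketer", "State",
    ],
}


def classify(value: str) -> str:
    """Bucket a raw entity_type value by its apparent encoding (hypothetical helper)."""
    if value.isdigit():
        return "numeric_code"
    if len(value) == 1 and value.isalpha():
        return "letter_code"
    return "label"


def summarize(entity_types: dict) -> dict:
    """Group each table's values into numeric codes, letter codes, and labels."""
    summary = {}
    for table, values in entity_types.items():
        buckets = {"numeric_code": [], "letter_code": [], "label": []}
        for value in values:
            buckets[classify(value)].append(value)
        summary[table] = buckets
    return summary


summary = summarize(all_entity_types_eia861)

# Labels shared by every table that contains at least one full label.
labels_by_table = {
    table: set(buckets["label"])
    for table, buckets in summary.items()
    if buckets["label"]
}
common_labels = set.intersection(*labels_by_table.values())

for table, buckets in sorted(summary.items()):
    print(table, {kind: len(vals) for kind, vals in buckets.items()})
print("labels common to all labelled tables:", sorted(common_labels))
```

Running the same summary over the full dictionary would show which tables still need their codes translated and which labels appear in only a few tables.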

Possible Actions / Questions:

zaneselvans commented 2 years ago

See also related issues #1841 and #669