catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

ownership table reporting owner in operator column in 2010. #1116

Closed cmgosnell closed 3 years ago

cmgosnell commented 3 years ago

Describe the bug

For 730 plant/generator/year records (about 1.9% of the total), the utility_id_eia column contains the owner's ID (the owner_utility_id_eia) rather than the operator's. See our column description for more details here. For whatever reason, these records are all in 2010.

This problem makes merging this table with the generators difficult because we expect the utility_id_eia to be the operator (not the owner) in both tables.

This is probably also a source of the inconsistencies in the harvested utility_id_eia.

Bug Severity

How badly is this bug affecting you?

To Reproduce

With a fully loaded database, using the pudl output object:

image

I also checked by pulling directly from the database and saw the same issue.
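Since the screenshot is not reproduced here, the following is a minimal sketch of one way to flag the affected records with pandas. It is not the query from the original report. The two DataFrames are toy stand-ins with made-up values, and the `report_year` column and the `_gens` suffix are assumptions for illustration; only `plant_id_eia`, `generator_id`, `utility_id_eia`, and `owner_utility_id_eia` are named in this issue.

```python
import pandas as pd

# Toy stand-ins for the ownership and generators tables (values are made up).
# "report_year" is an assumed column name, not taken from the issue.
ownership = pd.DataFrame({
    "report_year": [2009, 2010, 2010, 2011],
    "plant_id_eia": [1, 1, 2, 2],
    "generator_id": ["A", "A", "B", "B"],
    "utility_id_eia": [100, 200, 300, 300],
    "owner_utility_id_eia": [200, 200, 400, 400],
})
generators = pd.DataFrame({
    "report_year": [2009, 2010, 2010, 2011],
    "plant_id_eia": [1, 1, 2, 2],
    "generator_id": ["A", "A", "B", "B"],
    "utility_id_eia": [100, 100, 300, 300],  # treated as the operator
})

keys = ["report_year", "plant_id_eia", "generator_id"]

# Attach the generators table's operator ID to each ownership record.
merged = ownership.merge(
    generators,
    on=keys,
    how="left",
    suffixes=("", "_gens"),
    validate="many_to_one",
)

# Suspect rows: the ownership table's utility_id_eia disagrees with the
# generators table and instead matches the owner ID.
suspect = merged[
    (merged["utility_id_eia"] != merged["utility_id_eia_gens"])
    & (merged["utility_id_eia"] == merged["owner_utility_id_eia"])
]

# Count suspect rows per year to see whether they cluster in one year.
suspect_by_year = suspect.groupby("report_year").size()
```

On real data, `suspect_by_year` would show whether the mismatches are confined to a single reporting year, as described above.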

zaneselvans commented 3 years ago

Proposed solution

In addition, utility_id_eia should not be in the annual generators_eia860 table -- it is entirely determined by the plant_id_eia which the generator_id is associated with. See #1266
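As a rough illustration of that claim, the sketch below checks whether each plant maps to exactly one utility_id_eia in a generators-style table. The DataFrame is a toy example with made-up values, `report_year` is an assumed column name, and grouping by year as well as plant is an assumption about how the dependency would hold in an annual table.

```python
import pandas as pd

# Toy generators-style table (values are made up for illustration).
gens = pd.DataFrame({
    "report_year": [2010, 2010, 2010, 2010],
    "plant_id_eia": [1, 1, 2, 2],
    "generator_id": ["A", "B", "A", "B"],
    "utility_id_eia": [100, 100, 300, 301],
})

# Count the distinct operators reported for each plant in each year.
operators_per_plant = (
    gens.groupby(["report_year", "plant_id_eia"])["utility_id_eia"].nunique()
)

# Any plant-year with more than one operator breaks the assumed dependency.
conflicts = operators_per_plant[operators_per_plant > 1]
```

An empty `conflicts` result would be consistent with the column being redundant at the generator level; any rows it returns are plant-years worth inspecting before the column is dropped.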