catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

`plants_eia860` is incorrectly assigning `balancing_authority_code_eia` for plants in ISNE #2255

Closed grgmiller closed 1 year ago

grgmiller commented 1 year ago

The `plants_eia860` output appears to be assigning the wrong balancing authority code to several plants. EIA-860 identifies them as being in ISNE, but PUDL assigns them to NYIS: image

For example, here is the output of plants_eia860 for one of those plants: image

For this specific plant, the BA code has been listed as ISNE in EIA-860 for at least the past three years.

I'm not sure why this is happening, but it is affecting data for a number of plants.

grgmiller commented 1 year ago

To be clear, some of these plants are located right on the border of NYIS and ISNE, so could conceivably be located in either. For example, 6456 is a hydro plant that is located on the river between NY and VT.

However, in other cases, the plants are located deep in the heart of NY, nowhere close to ISNE.

zaneselvans commented 1 year ago

This is definitely odd. It's not even an output / filling thing -- it's like this in the DB table.

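```python
import pandas as pd

# pudl_engine is assumed to be a SQLAlchemy engine connected to the PUDL database.
plants_eia860 = pd.read_sql("plants_eia860", pudl_engine).convert_dtypes(convert_floating=False)
# Show the BA code and name stored for plant 6456 in each report year.
print(plants_eia860[plants_eia860.plant_id_eia == 6456][[
    "plant_id_eia",
    "report_date",
    "balancing_authority_code_eia",
    "balancing_authority_name_eia",
]].to_markdown())
```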
|        | plant_id_eia | report_date | balancing_authority_code_eia | balancing_authority_name_eia |
|-------:|-------------:|:------------|:-----------------------------|:-----------------------------|
|  11378 |         6456 | 2022-01-01  | NYIS                         |                              |
|  25030 |         6456 | 2021-01-01  | NYIS                         | ISO New England Inc.         |
|  37668 |         6456 | 2020-01-01  | NYIS                         | ISO New England Inc.         |
|  49518 |         6456 | 2019-01-01  | NYIS                         | ISO New England Inc.         |
|  60522 |         6456 | 2018-01-01  | NYIS                         | ISO New England Inc.         |
|  70676 |         6456 | 2017-01-01  | NYIS                         | ISO New England Inc.         |
|  80410 |         6456 | 2016-01-01  | NYIS                         | ISO New England Inc.         |
|  89364 |         6456 | 2015-01-01  | NYIS                         | ISO New England Inc.         |
|  97906 |         6456 | 2014-01-01  | NYIS                         | ISO New England Inc.         |
| 105983 |         6456 | 2013-01-01  | NYIS                         | ISO New England Inc.         |
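
The code/name disagreement in the rows above could be checked mechanically. Here is a minimal sketch, assuming a hand-written code-to-name lookup. `BA_NAMES`, `find_code_name_mismatches`, the NYIS name string, and plant 9999 are all hypothetical and used only for illustration; PUDL may not store such a mapping in this form.

```python
import pandas as pd

# Hypothetical BA code -> BA name lookup, written by hand for illustration.
# The ISNE name is the one shown in the table; the NYIS name is an assumption.
BA_NAMES = {
    "ISNE": "ISO New England Inc.",
    "NYIS": "New York Independent System Operator",
}


def find_code_name_mismatches(plants: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose stored BA name differs from the name implied by the BA code."""
    expected = plants["balancing_authority_code_eia"].map(BA_NAMES)
    reported = plants["balancing_authority_name_eia"]
    # Only flag rows where both names are known and they differ.
    mask = expected.notna() & reported.notna() & (expected != reported)
    return plants[mask]


# Two rows mirror the table above; plant 9999 is a made-up consistent record.
sample = pd.DataFrame({
    "plant_id_eia": [6456, 6456, 9999],
    "report_date": ["2022-01-01", "2021-01-01", "2021-01-01"],
    "balancing_authority_code_eia": ["NYIS", "NYIS", "ISNE"],
    "balancing_authority_name_eia": [None, "ISO New England Inc.", "ISO New England Inc."],
})
mismatches = find_code_name_mismatches(sample)
print(mismatches)
```

Rows with a missing name (like the 2022 record) are deliberately not flagged here, since a blank name is a different problem from a contradictory one.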

@cmgosnell I think this must be coming from `pudl.transform.eia.fix_balancing_authority_codes_with_state()`, which appears to make a blanket assumption that all plants in NY state must also be part of NYISO. That assumption got rolled in with some fixes to the BA codes for PACW / PACE when the states were clearly wrong.

It seems like we should also be updating the BA names when we're fixing these things.

Did we have external evidence suggesting that these plants were reporting the wrong ISO? Or was it just the state vs. ISO name mismatch?
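
To illustrate why a blanket state-based override leaves the table in the state shown earlier, here is a simplified sketch. It is not the actual body of `fix_balancing_authority_codes_with_state()`, which isn't shown in this thread. `override_ba_code_by_state`, the `state` column name, and the sample plant are hypothetical.

```python
import pandas as pd


def override_ba_code_by_state(plants: pd.DataFrame, state: str, ba_code: str) -> pd.DataFrame:
    """Hypothetical blanket fix: set the BA code of every plant in `state` to `ba_code`.

    Only the code column is touched, so the BA name can end up stale.
    """
    out = plants.copy()
    out.loc[out["state"] == state, "balancing_authority_code_eia"] = ba_code
    return out


# A made-up NY plant that reports ISNE as its balancing authority.
reported = pd.DataFrame({
    "plant_id_eia": [1],
    "state": ["NY"],
    "balancing_authority_code_eia": ["ISNE"],
    "balancing_authority_name_eia": ["ISO New England Inc."],
})
fixed = override_ba_code_by_state(reported, state="NY", ba_code="NYIS")
print(fixed)
```

After the override, the code reads NYIS while the name still says ISO New England Inc., which is the same pattern as plant 6456.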

cmgosnell commented 1 year ago

These overrides of the codes happened via #1911, with the discussion in #1909. As far as I can see, yes, we only updated the code during the process, not the name.

It seems like there are two courses of action here that we should probably employ:

Is there any good way to actually know if this is a data reporting problem vs they are actually "in" ISNE?

grgmiller commented 1 year ago

ISONE's asset listing may be a way to help confirm this: https://www.iso-ne.com/participate/participant-asset-listings/

zaneselvans commented 1 year ago

Sounds like a job for the new spot-fixer!

e-belfer commented 1 year ago

More thorough answer still in progress, but came across NYISO's generator listings: https://www.nyiso.com/documents/20142/2226333/2022-Gold-Book-Final-Public.pdf/cd2fb218-fd1e-8428-7f19-df3e0cf4df3e

e-belfer commented 1 year ago

If I'm understanding correctly, my short takeaway from reviewing selected plants on the list is that some of these facilities genuinely should be assigned to ISNE. A few examples:

And yet others should probably genuinely be assigned to NYIS. See for example:

And so on. In the absence of a clear pattern one way or another, we could go through this list manually or we could default to the ISO codes provided by EIA for NY state facilities. I'm presuming this level of manual validation isn't standard for PUDL, but it isn't an outrageous number of plants to assign. Thoughts? @cmgosnell @zaneselvans

grgmiller commented 1 year ago

> but it isn't an outrageous number of plants to assign

Do we know if this is limited to this NYISO/ISONE issue, or are there other locations across the country where a plant is located in a state that doesn't seem to match with its assigned ISO?
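
One rough way to scope this would be to compare each plant's state with the states its reported BA is expected to cover. A minimal sketch, assuming a hand-maintained `EXPECTED_STATES` lookup and a `state` column; both, along with the sample plants, are hypothetical. A flagged row is only a candidate for review, not proof of a reporting error.

```python
import pandas as pd

# Hypothetical lookup of the states each BA is expected to cover.
# BAs missing from this dict are simply not checked.
EXPECTED_STATES = {
    "ISNE": {"CT", "MA", "ME", "NH", "RI", "VT"},
    "NYIS": {"NY"},
}


def flag_state_ba_mismatches(plants: pd.DataFrame) -> pd.DataFrame:
    """Return plants whose state is outside the expected footprint of their reported BA."""

    def is_mismatch(row: pd.Series) -> bool:
        expected = EXPECTED_STATES.get(row["balancing_authority_code_eia"])
        return expected is not None and row["state"] not in expected

    mask = plants.apply(is_mismatch, axis=1)
    return plants[mask]


# Made-up plants for illustration.
sample = pd.DataFrame({
    "plant_id_eia": [1, 2, 3],
    "state": ["NY", "NY", "OR"],
    "balancing_authority_code_eia": ["ISNE", "NYIS", "PACW"],
})
flagged = flag_state_ba_mismatches(sample)
print(flagged)
```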

One source of this discrepancy could be something that we discovered when working with EIA-930 data: that there are both "physical" and "commercial" definitions of balancing authorities (see our documentation of this issue here).

Perhaps plants like Carver Falls and Fisher's Island are considered to be within the ISONE commercial boundary but the NYISO physical boundary.

zaneselvans commented 1 year ago

I don't think we have any idea the extent to which the BAs are incorrectly assigned. IIRC the apparent PACW/PACE errors were creating an issue in some other analysis that @cmgosnell was doing, and in investigating that we happened to notice the NYISO/NEISO weirdness too, and tacked this "fix" on. I think Christina might have more context on what it was that necessitated the fix (maybe some BA-level aggregations for RMI?).

cmgosnell commented 1 year ago

I think reverting the NYISO override is a fine idea! Unless there are very clear known issues that we want to spot fix. The discussion of the exploration of these fixes is in #1909. I felt ambiguous about it then, so reverting to whatever they report seems like a good idea.

On the name fix, we can and should fix the names at the same moment we fix these non-NY codes. Or we could strip the name out of the tables altogether and make an encoder table. Or even an entity table and harvest the names. I think the latter is probably the right thing to do.
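
A rough sketch of the "strip the name out and look it up by code" idea follows. It is a simplification, not PUDL's actual harvesting process: `harvest_ba_names` is a hypothetical helper that takes the most frequently reported name for each code, and the sample rows (including the variant spelling) are made up.

```python
import pandas as pd

CODE = "balancing_authority_code_eia"
NAME = "balancing_authority_name_eia"


def harvest_ba_names(plants: pd.DataFrame) -> pd.DataFrame:
    """Build a code -> name lookup using the most common reported name per code."""
    pairs = plants[[CODE, NAME]].dropna()
    names = pairs.groupby(CODE)[NAME].agg(lambda s: s.mode().iloc[0])
    return names.reset_index()


# Made-up records: one plant reports a variant spelling of the ISNE name.
sample = pd.DataFrame({
    "plant_id_eia": [1, 2, 3, 4],
    CODE: ["ISNE", "ISNE", "ISNE", "NYIS"],
    NAME: ["ISO New England Inc.", "ISO New England Inc.", "ISO New England", None],
})
lookup = harvest_ba_names(sample)

# Drop the per-plant name and re-attach the single harvested name for each code.
normalized = sample.drop(columns=[NAME]).merge(lookup, on=CODE, how="left")
print(normalized)
```

With names stored once per code, changing a plant's BA code can no longer leave a stale name behind. Codes with no reported name at all (NYIS in this sample) come back empty and would need another source.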

zaneselvans commented 1 year ago

I think I agree that we kind of overstepped on trying to fix the NYISO/NEISO ambiguities and we should probs just drop that fix, since we're not gonna go through and try to research and fix all the possible errors.

e-belfer commented 1 year ago

Reverted the NYISO fix in PR #2312, which also updates the BA names to match the BA code whenever we do change a code.
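
For readers who want the gist of keeping code and name in sync, here is a minimal sketch. It is not the code from the PR. `apply_ba_code_fixes`, the `fixes` dict, the plant ID, and the PacifiCorp name strings are all assumptions made for illustration.

```python
import pandas as pd

# Assumed display names for the two PacifiCorp BA codes (illustrative only).
BA_NAMES = {"PACE": "PacifiCorp East", "PACW": "PacifiCorp West"}


def apply_ba_code_fixes(plants: pd.DataFrame, fixes: dict, ba_names: dict) -> pd.DataFrame:
    """Apply per-plant BA code corrections and update the BA name in the same step."""
    out = plants.copy()
    for plant_id, new_code in fixes.items():
        rows = out["plant_id_eia"] == plant_id
        out.loc[rows, "balancing_authority_code_eia"] = new_code
        # A KeyError here means the lookup lacks a name for the new code,
        # which is better surfaced than silently leaving a stale name.
        out.loc[rows, "balancing_authority_name_eia"] = ba_names[new_code]
    return out


# Made-up plant whose code is being corrected from PACW to PACE.
sample = pd.DataFrame({
    "plant_id_eia": [1],
    "balancing_authority_code_eia": ["PACW"],
    "balancing_authority_name_eia": ["PacifiCorp West"],
})
fixed = apply_ba_code_fixes(sample, fixes={1: "PACE"}, ba_names=BA_NAMES)
print(fixed)
```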