catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Integrate oddball/unmapped FERC Plant names #492

Closed: zaneselvans closed this 2 years ago

zaneselvans commented 4 years ago

Now that all the weird missing respondents have been integrated, and all the old FERC Plant years have been integrated, we're left with a handful of unmapped plants. They seem to fall into three categories:

For the first two cases, we should probably just add the plants -- there may be some character encoding nonsense to deal with to get the special characters to map correctly. For the "junk" names, we can create a special "Junk" plant name, so that any records associated with those data entry errors are preserved in case they're useful later. They'll still show up under their respective reporting utility, and if we end up working with that utility and its plant data later, we'll at least have the Junk Plant records available to integrate based on other cues (capacity, generation, etc.).
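A minimal sketch of that idea in Python, for illustration only. `JUNK_PLANT_NAME`, `normalize_plant_name`, `assign_plant_name`, the record fields, and the sample names are all hypothetical and not taken from the PUDL codebase. It normalizes special characters before looking a name up, and routes anything unrecognized to a placeholder while keeping the record under its utility.

```python
import unicodedata

# Hypothetical placeholder name; not an identifier from the PUDL codebase.
JUNK_PLANT_NAME = "junk"


def normalize_plant_name(raw):
    """Fold Unicode variants, collapse whitespace, and lowercase a plant name."""
    folded = unicodedata.normalize("NFKC", raw)
    return " ".join(folded.split()).lower()


def assign_plant_name(raw, known_plants):
    """Return the normalized name if it is known, else the junk placeholder."""
    name = normalize_plant_name(raw)
    return name if name in known_plants else JUNK_PLANT_NAME


# Made-up records: the first uses a decomposed accent (e + U+0301) and stray
# whitespace; the second stands in for a data entry error.
records = [
    {"utility": "Utility A", "plant_name_raw": "  Cafe\u0301  Station ", "capacity_mw": 120.0},
    {"utility": "Utility A", "plant_name_raw": "total", "capacity_mw": 35.0},
]
known = {normalize_plant_name("Caf\u00e9 Station")}

# Every record is kept; only the assigned plant name differs.
mapped = [
    {**rec, "plant_name": assign_plant_name(rec["plant_name_raw"], known)}
    for rec in records
]
```

Because the junk record keeps its utility and its other columns, it can still be matched later on capacity or generation.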

swinter2011 commented 4 years ago

I added the 15 remaining unmapped plants to the FERC-EIA matching spreadsheet. This resolves two unmapped plants. 13 plants remain unmapped and do not appear in the DataPackages.

cmgosnell commented 2 years ago

@zaneselvans @swinter2011 is this done??

zaneselvans commented 2 years ago

It looks like this is no longer an issue, since we started injecting the dummy utility names that are missing. As a result we've been getting the unmapped plants and mapping them since 2019. There are some in 2020 that will need to be mapped but it looks like our current process catches these now.
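As a rough illustration of what "catching" unmapped plants can mean, here is a sketch that compares reported plants against an existing mapping. It is not PUDL's actual process; `find_unmapped`, the `(respondent_id, plant_name)` key, and the sample data are assumptions made up for this example.

```python
def find_unmapped(reported, mapping):
    """Return reported (respondent_id, plant_name) pairs absent from the mapping."""
    # Iterating a dict yields its keys, so set(mapping) is the set of mapped pairs.
    return sorted(set(reported) - set(mapping))


# Hypothetical reported plants for one year; duplicates are expected.
reported_2020 = [(1, "alpha"), (1, "beta"), (2, "gamma"), (1, "alpha")]

# Hypothetical existing mapping from (respondent_id, plant_name) to a plant ID.
mapping = {(1, "alpha"): 101, (2, "gamma"): 102}

unmapped = find_unmapped(reported_2020, mapping)
print(unmapped)  # [(1, 'beta')]
```

Anything the check returns is a candidate to add to the mapping spreadsheet.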