USEPA / camd-eia-crosswalk

A data crosswalk to integrate U.S. power sector emission and operation data from EPA to EIA
MIT License

Explain duplicates in README #12

Open j-tafoya opened 3 years ago

j-tafoya commented 3 years ago

When either EPA or EIA has more identifiers than the other for the same units, the identifiers on the side with fewer are repeated across several rows, which creates "duplicate" outputs in the crosswalk.

For example, the following two plants (IDs 52151 and 7903), both included in the manual match file, have duplicates for CAMD units and EIA units, respectively.

| CAMD_PLANT_ID | CAMD_UNIT_ID | CAMD_GENERATOR_ID | EIA_PLANT_ID | EIA_BOILER_ID | EIA_GENERATOR_ID |
| --- | --- | --- | --- | --- | --- |
| 52151 | 001 | GEN1 | 52151 | PB1 | GEN1 |
| 52151 | 001 | GEN1 | 52151 | RF1 | GEN1 |
| 52151 | 001 | GEN2 | 52151 | PB2 | GEN2 |
| 52151 | 001 | GEN2 | 52151 | RF2 | GEN2 |
| 7903 | MGS1A | MGS1A | 7903 | | MGS1 |
| 7903 | MGS1B | MGS1B | 7903 | | MGS1 |
| 7903 | MGS2A | MGS2A | 7903 | | MSG2 |
| 7903 | MGS2B | MGS2B | 7903 | | MSG2 |

j-tafoya commented 3 years ago

Include an explanation of these occurrences in the README. See PLANT_ID 7903 and 52151 for examples.
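
As a rough illustration of what a README explanation could point to, here is a minimal sketch that flags which side of each row is repeated. It is not part of the crosswalk code: the `ROWS` literal is copied verbatim from the rows quoted in this issue, `repeated_keys` is a hypothetical helper, and treating the full identifier triple on each side as the key is an assumption made only for this example.

```python
from collections import defaultdict

COLUMNS = [
    "CAMD_PLANT_ID", "CAMD_UNIT_ID", "CAMD_GENERATOR_ID",
    "EIA_PLANT_ID", "EIA_BOILER_ID", "EIA_GENERATOR_ID",
]

# Rows copied verbatim from the issue; a blank EIA_BOILER_ID is kept as "".
ROWS = [
    ("52151", "001", "GEN1", "52151", "PB1", "GEN1"),
    ("52151", "001", "GEN1", "52151", "RF1", "GEN1"),
    ("52151", "001", "GEN2", "52151", "PB2", "GEN2"),
    ("52151", "001", "GEN2", "52151", "RF2", "GEN2"),
    ("7903", "MGS1A", "MGS1A", "7903", "", "MGS1"),
    ("7903", "MGS1B", "MGS1B", "7903", "", "MGS1"),
    ("7903", "MGS2A", "MGS2A", "7903", "", "MSG2"),
    ("7903", "MGS2B", "MGS2B", "7903", "", "MSG2"),
]

# Assumed keys for this sketch: all three identifier columns on each side.
CAMD_KEY = COLUMNS[:3]
EIA_KEY = COLUMNS[3:]


def repeated_keys(rows, key_cols):
    """Return {key: row_count} for every key that appears in more than one row."""
    counts = defaultdict(int)
    for row in rows:
        record = dict(zip(COLUMNS, row))
        counts[tuple(record[col] for col in key_cols)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# CAMD identifiers repeated because EIA lists more boilers for the same unit.
camd_repeats = repeated_keys(ROWS, CAMD_KEY)
# EIA identifiers repeated because CAMD lists more units for the same generator.
eia_repeats = repeated_keys(ROWS, EIA_KEY)

print("Repeated CAMD keys:", camd_repeats)
print("Repeated EIA keys:", eia_repeats)
```

With these rows, only plant 52151 shows up among the repeated CAMD keys and only plant 7903 among the repeated EIA keys, which matches the two cases described above.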