singularity-energy / open-grid-emissions

Tools for producing high-quality hourly generation and emissions data for U.S. electric grids
MIT License

Fix and add information to plant static attributes #368

Closed: rouille closed this 5 days ago

rouille commented 1 week ago

Purpose

Fix positive longitudes and erroneous time zones of plants, add missing plant coordinates and locations (state, county, and city), and save the plant static attributes to the open_grid_emissions_data/results/plant_data folder. Closes CAR-4339, CAR-4340, CAR-4341.

What the code is doing

Testing

Successfully ran the 2005 pipeline

Where to look

The oge.helpers module has most of the changes

Usage Example/Visuals

Comparison of the 2005 plant static attributes files before and after the changes.

The code snippet below shows that the erroneous longitudes and time zones are fixed:

>>> import pandas as pd
>>> cols = ["plant_id_eia", "latitude", "longitude", "state", "county", "city", "timezone"]
>>> psa_old = pd.read_csv("~/Desktop/2005_outputs/plant_static_attributes_2005.csv", usecols=cols, index_col=0)
>>> psa_new = pd.read_csv("~/open_grid_emissions_data/results/2005/plant_data/plant_static_attributes.csv", usecols=cols, index_col=0)
>>> psa_old[psa_old["longitude"] > 0]
             state          county            city   latitude   longitude             timezone
plant_id_eia                                                                                  
50060           MD  Prince Georges  Upper Marlboro  38.824863   76.772200          Asia/Urumqi
54537           WA         Whatcom        Ferndale  48.828996  122.686666  America/Los_Angeles
55964           MD  Prince Georges  Upper Marlboro  38.847585   76.776400          Asia/Urumqi
>>> psa_new.loc[psa_old[psa_old["longitude"] > 0].index]
             state          county            city   latitude   longitude             timezone
plant_id_eia                                                                                  
50060           MD  Prince Georges  Upper Marlboro  38.824863  -76.773149     America/New_York
54537           WA         Whatcom        Ferndale  48.828996 -122.685114  America/Los_Angeles
55964           MD  Prince Georges  Upper Marlboro  38.847585  -76.785912     America/New_York
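
The check above can be generalized into a small sanity test on the output file. The following is only a sketch, not the logic in oge.helpers: the function name `flag_suspect_locations` is hypothetical, and the rules (longitude must be negative, time zone must start with `America/` or `Pacific/`) are a heuristic assumption that would need adjusting for plants in places such as Guam or the western Aleutians, where positive longitudes are legitimate. The sample rows reuse values from the tables above for illustration.

```python
import pandas as pd


def flag_suspect_locations(psa: pd.DataFrame) -> pd.DataFrame:
    """Return rows whose longitude or time zone looks implausible for a U.S. plant.

    Heuristic only (assumption): U.S. plants are expected to have a negative
    longitude and an IANA time zone under "America/" or "Pacific/".
    """
    positive_lon = psa["longitude"] > 0
    odd_tz = ~psa["timezone"].str.match(r"(America|Pacific)/", na=False)
    return psa[positive_lon | odd_tz]


# Illustrative rows: two "before" rows and one "after" row from the tables above.
sample = pd.DataFrame(
    {
        "longitude": [76.772200, 122.686666, -76.785912],
        "timezone": ["Asia/Urumqi", "America/Los_Angeles", "America/New_York"],
    },
    index=pd.Index([50060, 54537, 55964], name="plant_id_eia"),
)

suspects = flag_suspect_locations(sample)
print(suspects.index.tolist())
```

Applied to the new file, `flag_suspect_locations(psa_new)` returning an empty frame would be one way to guard against this kind of regression.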

The code snippet below shows that most of the missing coordinates and locations have been filled in:

>>> psa_old.isna().sum()
state         1
county       63
city         12
latitude     46
longitude    30
timezone      0
dtype: int64
>>> psa_new.isna().sum()
state         0
county        4
city         12
latitude      1
longitude     1
timezone      0
dtype: int64
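
The two counts can also be placed side by side to make the improvement easier to read. This is a minimal sketch: the helper name `missing_value_summary` and the toy frames are hypothetical and only stand in for `psa_old` and `psa_new`.

```python
import pandas as pd


def missing_value_summary(before: pd.DataFrame, after: pd.DataFrame) -> pd.DataFrame:
    """Count missing values per column before and after, plus how many were filled."""
    summary = pd.DataFrame(
        {"before": before.isna().sum(), "after": after.isna().sum()}
    )
    summary["filled"] = summary["before"] - summary["after"]
    return summary


# Toy frames (made-up values) standing in for the old and new attribute files.
toy_old = pd.DataFrame(
    {"county": [None, "Whatcom", None], "longitude": [None, -122.68, -76.78]}
)
toy_new = pd.DataFrame(
    {"county": ["Prince Georges", "Whatcom", None], "longitude": [-76.77, -122.68, -76.78]}
)

summary = missing_value_summary(toy_old, toy_new)
print(summary)
```

Calling `missing_value_summary(psa_old, psa_new)` on the real frames should reproduce the two `isna().sum()` outputs above in a single table.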

The remaining missing information is:

>>> psa_new[psa_new["county"].isna() | psa_new["city"].isna() | psa_new["latitude"].isna() | psa_new["longitude"].isna()]
             state       county        city   latitude   longitude             timezone
plant_id_eia                                                                           
414             CA     Tuolumne         NaN  38.202569 -120.077000  America/Los_Angeles
415             CA     Tuolumne         NaN  38.246656 -120.034100  America/Los_Angeles
603             DC          NaN  Washington  38.899400  -76.959200     America/New_York
3520            TX        Pecos         NaN  30.683611 -102.802800      America/Chicago
7253            SC  NOT IN FILE     unsited        NaN         NaN     America/New_York
7338            CA       Plumas         NaN  39.889287 -121.279200  America/Los_Angeles
10125           NY          NaN     Jamaica  40.702913  -73.800643     America/New_York
10159           MI    Allegheny         NaN  40.512778  -79.800830     America/New_York
10377           VA          NaN    Hopewell  37.293900  -77.269700     America/New_York
10458           CA       Lassen         NaN  40.976389 -121.255800  America/Los_Angeles
13213           MS        Union         NaN  34.541100  -88.942200      America/Chicago
50242           GA       Newton         NaN  33.570092  -83.893920     America/New_York
54088           NY     Saratoga         NaN  43.250000  -73.814400     America/New_York
54650           CA    Riverside         NaN  33.721999 -116.037247  America/Los_Angeles
54767           NY          NaN    Brooklyn  40.670556  -73.936390     America/New_York
54934           PA   Lackawanna         NaN  41.436308  -75.589700     America/New_York
55316           IL        Logan         NaN  40.079661  -89.433729      America/Chicago

Review estimate

15min

Future work

N/A

Checklist