Open cmgosnell opened 4 years ago
To make this work we'll need a mapping of lat/lon values to state and/or county boundaries. The state boundaries wouldn't be too large, but the county boundaries could be. State boundaries are certainly available as a public dataset.
Might make sense to do in conjunction with adding FIPS IDs (see issue #338).
The Python module stateplane does this.
At an even more basic level, it would be great to implement a sanity check that the coordinates are even in the U.S., and to flag them as bad coordinates if they are not. A rough bounding box for the continental U.S. is:
LONGITUDE_MIN = -126
LONGITUDE_MAX = -66
LATITUDE_MIN = 25
LATITUDE_MAX = 49
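A minimal sketch of what that sanity check could look like, using the bounds above. The function name `in_conus_bbox` is made up for illustration and is not part of any existing module. Note that plants in Alaska, Hawaii, or the territories would fail this check by design, so it only makes sense as a first-pass screen.

```python
# Rough bounding box for the continental U.S., copied from the values above.
LONGITUDE_MIN = -126
LONGITUDE_MAX = -66
LATITUDE_MIN = 25
LATITUDE_MAX = 49


def in_conus_bbox(latitude, longitude):
    """Return True if the point falls inside the rough continental U.S. box.

    Hypothetical helper for illustration. Missing coordinates (None) are
    treated as failing the check.
    """
    if latitude is None or longitude is None:
        return False
    return (
        LATITUDE_MIN <= latitude <= LATITUDE_MAX
        and LONGITUDE_MIN <= longitude <= LONGITUDE_MAX
    )
```

For example, `in_conus_bbox(36.468, -77.592)` passes, while the same latitude with a positive longitude of `77.592` does not.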
We recently checked the wind and solar plants in the 2021 EIA-860 generators table against this bounding box and found several plants whose coordinates were in the middle of the ocean or in China:
ba_code | plant_id_eia | generator_id | capacity_mw | latitude | longitude |
---|---|---|---|---|---|
ERCO | 62715 | WCCWF | 180.1 | 33.54915 | -33.550555 |
ISNE | 58279 | 1 | 5.0 | 42.219722 | -42.219722 |
ISNE | 58280 | 1 | 3.0 | 42.164722 | -42.164722 |
ISNE | 58282 | 1 | 4.8 | 42.222778 | -42.219722 |
ISNE | 58283 | 1 | 4.5 | 41.766389 | -41.766389 |
PJM | 59641 | 5MWPV | 5.0 | 36.468 | 77.592 |
DUK | 59929 | NB007 | 5.0 | 35.72 | 81.417 |
Some of these are easy to fix: for the PJM and DUK generators, the longitude is just missing its negative sign (this can easily be verified with Google Maps satellite view). For the other ones, it looks like the longitude value may have been missing and was simply filled in with the negative (or nearly the negative) of the latitude value. These might need some manual searching for the plant's actual location.
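The two patterns in the table could be screened for automatically. Below is a rough sketch, not a tested part of the ETL: the function name `suggest_fix`, the returned labels, and the `tol` tolerance of 0.01 degrees are all assumptions chosen for illustration. It only labels rows; the sign flip would still be worth spot-checking, and the mirrored-latitude rows still need a manual lookup.

```python
# Rough continental U.S. bounding box from earlier in this issue.
LONGITUDE_MIN, LONGITUDE_MAX = -126, -66
LATITUDE_MIN, LATITUDE_MAX = 25, 49


def _in_bbox(latitude, longitude):
    return (
        LATITUDE_MIN <= latitude <= LATITUDE_MAX
        and LONGITUDE_MIN <= longitude <= LONGITUDE_MAX
    )


def suggest_fix(latitude, longitude, tol=0.01):
    """Label a coordinate pair with a likely problem (hypothetical helper).

    tol is an assumed tolerance, in degrees, for deciding that the
    longitude is just the negated latitude.
    """
    if _in_bbox(latitude, longitude):
        return "ok"
    # Longitude is missing its negative sign, e.g. 77.592 instead of -77.592.
    if _in_bbox(latitude, -longitude):
        return "flip_longitude_sign"
    # Longitude looks like it was filled in with the negative of the latitude.
    if abs(latitude + longitude) <= tol:
        return "longitude_mirrors_latitude"
    return "manual_review"


# Rows from the table above: (plant_id_eia, latitude, longitude).
suspect_rows = [
    (62715, 33.54915, -33.550555),
    (58279, 42.219722, -42.219722),
    (58282, 42.222778, -42.219722),
    (59641, 36.468, 77.592),
    (59929, 35.72, 81.417),
]
labels = {pid: suggest_fix(lat, lon) for pid, lat, lon in suspect_rows}
```

With these inputs, the PJM and DUK plants (59641 and 59929) are labeled `flip_longitude_sign` and the ERCO and ISNE plants are labeled `longitude_mirrors_latitude`.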
It would be great if this could be screened and fixed as part of the ETL process for EIA-860.
It would be good to test whether or not the latitude and longitude are in the same state and/or county as the plant or utility entity. Coming from Issue #276.
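As a rough illustration of the state check, here is a pure-Python point-in-polygon test (ray casting). Everything named here is hypothetical: `point_in_polygon`, `state_matches`, and the `STATE_RINGS` lookup are stand-ins, and the Colorado outline is only an approximate rectangle used to keep the example self-contained. A real implementation would presumably load actual state or county boundaries from a public dataset and use a geospatial library instead.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given by ring?

    ring is a list of (lon, lat) vertices; the last vertex is implicitly
    joined back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude where the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# ASSUMPTION: approximate rectangular outline of Colorado, for illustration
# only. Real boundaries would come from a public boundary dataset.
STATE_RINGS = {
    "CO": [(-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.0), (-109.05, 41.0)],
}


def state_matches(latitude, longitude, reported_state, rings=STATE_RINGS):
    """Return True/False, or None if no boundary is available for the state."""
    ring = rings.get(reported_state)
    if ring is None:
        return None
    return point_in_polygon(longitude, latitude, ring)
```

A point near Denver, `state_matches(39.74, -104.99, "CO")`, falls inside the outline, while coordinates from the North Carolina or Virginia rows above would not.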