Closed zaneselvans closed 4 years ago
Add plant_id_eia 60237 to that list as per #311
@joshdr83 how do you know they're wrong? The 60265 lat/lon values match across EIA860 (2016 & 2017) and plants_entities_eia. Those also line up with the county (Hunterdon, NJ) listed in NEEDS, and google maps shows a bunch of solar panels at that location.
I was cross-checking them against the 2___Plant_Y2017.xlsx sheet and the plants_entities_eia table? Did I mess one up? -Josh
On Jun 13, 2019, at 11:22 AM, Greg Schivley notifications@github.com wrote:
@joshdr83 https://github.com/joshdr83 how do you know they're wrong? The 60265 lat/lon values match across EIA860 (2016 & 2017) and plants_entities_eia. Those also line up with the county (Hunterdon, NJ) listed in NEEDS, and google maps shows a bunch of solar panels at that location https://www.google.com/maps/place/40%C2%B031'25.3%22N+74%C2%B050'36.2%22W/@40.523139,-74.845422,824m/data=!3m1!1e3!4m5!3m4!1s0x0:0x0!8m2!3d40.5237!4d-74.843398.
Hi @joshdr83 ! I'll look into this shortly... if you find any other lat/long inconsistencies drop them here!
60265 first shows up in the 2015 version of 860 with the wrong lat/lon (45.400524, -122.765822). Fun fact: that location is just down the street from a bouldering gym.
@zaneselvans and @cmgosnell is PUDL building the plants_entities_eia table using the first occurrence in EIA860? If so, these could all be issues with location in at least the first year. Could be solved by an error check to see if it's in the right state (did @karldw do something like this for CEMS timezones?) and then checking subsequent years. Or check all available years to see if they match.
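A rough sketch of the "is it in the right state" check suggested above, not PUDL code. The function names and the `STATE_BOUNDS` table are hypothetical, the NJ bounding box is an approximation that is coarser than the real state boundary, and the 2016/2017 coordinates are the ones from the map link earlier in the thread.

```python
# Approximate bounding boxes as (min_lat, max_lat, min_lon, max_lon).
# Illustrative values only; a real check would use state polygons.
STATE_BOUNDS = {"NJ": (38.9, 41.4, -75.6, -73.9)}


def in_state(lat, lon, state, bounds=STATE_BOUNDS):
    """Return True if (lat, lon) falls inside the rough box for `state`."""
    min_lat, max_lat, min_lon, max_lon = bounds[state]
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def flag_suspect_years(coords_by_year, state):
    """Return the sorted years whose coordinates fall outside the state box."""
    return sorted(
        year
        for year, (lat, lon) in coords_by_year.items()
        if not in_state(lat, lon, state)
    )


# Plant 60265: the 2015 value reported in this thread, then the matching
# 2016 and 2017 location.
plant_60265 = {
    2015: (45.400524, -122.765822),
    2016: (40.5237, -74.843398),
    2017: (40.5237, -74.843398),
}

print(flag_suspect_years(plant_60265, "NJ"))  # [2015]
```

A box check like this only catches gross errors such as a plant landing in the wrong part of the country; it says nothing about which of two in-state locations is correct.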
Ha! bouldering gyms are not power plants in the traditional sense!
They are actually being "harvested", of sorts, from all of their occurrences across 860 and 923. We take the most consistently reported record, and if the records are not 70% consistent then we don't bring in anything.
Lat/long is the messiest and most inconsistent field, so I made a tiny exception that rounds down the precision. That's definitely not the best way to do it, so if either of you has any suggestions, let me know. But I don't think the rounding is what is happening in these cases. These just seem broken, which makes me want to debug the harvesting process more generally.
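For readers following along, here is a minimal sketch of the harvesting rule as described in this comment. It is not the actual PUDL implementation. The function names are hypothetical, treating "70% consistent" as "at least 70% of non-null reports agree" is an assumption, and so is rounding to one decimal place, since the thread does not say how many digits are kept.

```python
from collections import Counter


def harvest_value(values, threshold=0.7):
    """Return the most common non-null value if it accounts for at least
    `threshold` of the non-null reports; otherwise return None."""
    reported = [v for v in values if v is not None]
    if not reported:
        return None
    value, count = Counter(reported).most_common(1)[0]
    return value if count / len(reported) >= threshold else None


def harvest_lat_lon(pairs, ndigits=1, threshold=0.7):
    """Round each (lat, lon) pair before harvesting so that small reporting
    differences do not count as disagreement. `ndigits` is an assumption."""
    rounded = [(round(lat, ndigits), round(lon, ndigits)) for lat, lon in pairs]
    return harvest_value(rounded, threshold)


# Toy input: one bad report out of three is only 2/3 agreement, which is
# below the 70% threshold, so nothing is harvested.
three_reports = [
    (45.400524, -122.765822),
    (40.5237, -74.843398),
    (40.5237, -74.843398),
]
print(harvest_lat_lon(three_reports))  # None

# With a fourth agreeing report, 3/4 = 75% clears the threshold.
print(harvest_lat_lon(three_reports + [(40.5237, -74.843398)]))  # (40.5, -74.8)
```

Under these assumptions a single bad year among few reports yields a missing value rather than a wrong one, so a wrong harvested value would point at something else in the process.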
Without the time to look closely at the existing code, I'd do something like this for lat/long:
Not sure if it's worth the trouble to do all of those checks/calculations for a few plants though.
@gschivley, I didn't do that check, but it shouldn't be terribly difficult. Take care if you use counties though, because they've changed a little over time.
subsumed within Issue #446
Describe the bug
The latitude & longitude values in the plants_entity_eia table are incorrect for plant_id_eia values 60265 and 60266, as reported by Josh Rhodes.