Open MichaelTiemannOSC opened 3 years ago
Hmm, that does seem like unexpected behavior. Are you finding those expected Latitude values by looking at the spreadsheets directly? I wonder if it might be a datatype problem where some of the values are being stored as strings in the spreadsheet or something.
Yes, those are values copied and pasted directly from the EPA spreadsheets using MS-Excel.
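One quick way to test the datatype hypothesis is to look at the Python types actually present in a column and see what numeric coercion does to them. The sketch below uses a made-up toy column, not the real EPA or EIA spreadsheets, and only illustrates how text cells can turn into NaN during a numeric conversion; it is not a claim about what the PUDL ETL does.

```python
import pandas as pd

# Toy column (made-up values) mimicking a spreadsheet read where some
# cells are stored as text rather than numbers.
raw = pd.DataFrame({"latitude": [33.826667, "33.826655", "n/a", None]})

# Count the Python types actually present in the column.
type_counts = raw["latitude"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Coerce to numeric: parseable strings become floats, anything that
# cannot be parsed (and missing values) becomes NaN.
raw["latitude_num"] = pd.to_numeric(raw["latitude"], errors="coerce")
print(raw)
```

If the real column shows a mix of `str` and `float` entries, that would point toward a parsing problem rather than missing source data.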
Looks like there are 3789 cases in which there's a non-null longitude, but a null latitude. But only 8 cases where there's a non-null latitude and a null longitude. Seems like a weird skew. Really we want to treat this as a single geopoint, and keep the pair so long as it's within a certain distance of the average location or something like that.
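The counts above can be reproduced with a couple of boolean masks. This is a minimal sketch on a made-up four-row table; the column names follow the query in this issue, but the `plants` frame and the `complete_pair` column are hypothetical stand-ins for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the plants table (values are made up).
plants = pd.DataFrame({
    "plant_id_eia": [1, 2, 3, 4],
    "latitude": [33.8, np.nan, np.nan, 40.1],
    "longitude": [-81.0, -90.5, np.nan, np.nan],
})

lat_null = plants["latitude"].isna()
lon_null = plants["longitude"].isna()

# Rows where only one half of the coordinate pair survived.
lon_only = int((lat_null & ~lon_null).sum())  # longitude present, latitude null
lat_only = int((~lat_null & lon_null).sum())  # latitude present, longitude null
print(lon_only, lat_only)

# Treating the pair as a single geopoint: a row counts as located
# only when both coordinates are present.
plants["complete_pair"] = ~lat_null & ~lon_null
print(plants)
```

Running the same masks against the real table should show whether the skew is concentrated in particular plants or years.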
Related: the seven plants returned by the query below all have longitudes missing their minus signs. 3 of them have a null latitude and the other 4 a non-null latitude:
https://data.catalyst.coop/pudl/plants_entity_eia?_sort=plant_id_eia&longitude__gt=0
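The same filter as the `longitude__gt=0` query can be applied in pandas to flag suspect rows. This sketch assumes a made-up `plants` frame, and it only flags rows for review rather than flipping the sign automatically, since a positive longitude can be legitimate for some locations (for example Guam).

```python
import pandas as pd

# Toy stand-in for the plants table (values are made up).
plants = pd.DataFrame({
    "plant_id_eia": [10, 11, 12],
    "longitude": [-81.3, 81.3, None],
})

# Positive longitudes are suspicious for most US plants; null
# longitudes compare as False and are therefore not flagged here.
suspect = plants[plants["longitude"] > 0]
print(suspect)
```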
Describe the bug
When I run this query in a Jupyter Notebook, I get a valid longitude but NaN for the latitude:
plants_eia860.loc[:, ["plant_id_eia", "latitude", "longitude"]][plants_eia860.plant_id_eia==3317]
To Reproduce
Steps to reproduce the behavior: run the query given above.
Settings file (settings.yml): I am using a fresh 01-pudl-parquet example Notebook and all its settings.
It seems to be wrong in all years.
Right from the start (once the pudl_engine is operative).
Expected behavior
In 2013 and 2014, the latitude was listed as 33.826667. In 2015 and later, it was listed as 33.826655.
Those two values agree once rounded to 2 digits, so they should easily satisfy the "round to 2 digits and listen to 70% of the votes" rule. BTW, this plant has the correct longitude value. I don't know what's messing up the latitude value.
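To make the "round and vote" expectation concrete, here is a minimal sketch of such a rule. The function name `consensus_value`, its parameters, and the way the 70% threshold is applied are assumptions made for illustration; this is not PUDL's actual harvesting code. The input list reuses the two latitude values reported above, with an arbitrary number of repeats standing in for the reporting years.

```python
from collections import Counter


def consensus_value(values, ndigits=2, threshold=0.7):
    """Return the rounded value that wins at least `threshold` of the
    non-null votes, or None if no value reaches that share.

    Hypothetical helper for illustration only.
    """
    observed = [round(v, ndigits) for v in values if v is not None]
    if not observed:
        return None
    value, count = Counter(observed).most_common(1)[0]
    return value if count / len(observed) >= threshold else None


# The two reported latitudes; the number of repeats is illustrative.
latitudes = [33.826667, 33.826667, 33.826655, 33.826655, 33.826655]
print(consensus_value(latitudes))  # prints 33.83
```

Under this reading of the rule, every year votes for the same rounded value, so a null result for this plant would have to come from somewhere other than disagreement between years.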
Software Environment?
Jupyter Notebook on 2i2c