cga-wm / advgis-beta

This is Advanced GIS at W&M - beta version.
0 stars 0 forks source link

Query and Join data from two tables #25

Open parthree opened 4 years ago

parthree commented 4 years ago

OK, I have a lot of experience working with relational databases in Access, but this seems like a good basic query to try to learn in Python. I have two data tables:

My question is how to query the weather table to pluck out and join the weather data ONLY for the date AND location for each incident. The weather station name and data format matches in both tables and the field is called NAME_WEATHER. Same for the date field.

dt-woods commented 4 years ago

@parthree: the daily weather resource you linked to only has about 50 weather stations listed. Is there another resource that has the 120k+ records?

parthree commented 4 years ago

https://wm1693.box.com/s/8ro39tev5t84370w2m62jriab6pjhkqo

dt-woods commented 4 years ago

Move a copy onto AGOL w/ sharing here: https://wm-gis.maps.arcgis.com/home/item.html?id=1d4ccf034197404293d0337da6e72db9

dt-woods commented 4 years ago

Click here for example notebook.

If you kept the Weather Station CSV, you could join the output from the above to the stations (STATION ID?) to get the longitude and latitude and create a spatial dataframe for mapping.

parthree commented 4 years ago

If I understand your comment correctly, I believe I already did that. I got an exhaustive list of weather stations from the NOAA query in my AOI and date range, then did a spatial join for the closest station to each shark incident (in Pro, and that's not short for programming lol). The results of that join gave me the name of the station to get daily weather data from. It ended up being about 14 different stations that I used for the daily query. So there were two NOAA queries, one to get the station name and locations, and the other to get daily data from specific stations (no location needed because the station name was already joined to the attack data). Does that make sense?

dt-woods commented 4 years ago

Yeah. That makes sense. I was thinking that with the WeatherStations.csv, you could join the table to get the Lon/Lat pairs, create a spatial dataframe (in pandas) and directly make a point-based feature class. You can also just do an attribute join in ArcGIS once you have your updated CSV.

parthree commented 4 years ago

Took me 2 minutes to get the tables joined in Access:

SELECT JoinGHCN_Table.Case_Number, JoinGHCN_Table.Moon_Phase, JoinGHCN_Table.Water_Clarity, JoinGHCN_Table.OtherNotes, JoinGHCN_Table.Encounter, JoinGHCN_Table.Confirmation, JoinGHCN_Table.Date, JoinGHCN_Table.Year, JoinGHCN_Table.Location, JoinGHCN_Table.Activity, JoinGHCN_Table.Fatal__YN, JoinGHCN_Table.Time, JoinGHCN_Table.Species, JoinGHCN_Table.Investigator_or_Source, JoinGHCN_Table.NAME_WEATHER, WeatherData.PRCP, WeatherData.TMAX, WeatherData.TMIN FROM JoinGHCN_Table LEFT JOIN WeatherData ON (JoinGHCN_Table.Date = WeatherData.DATE) AND (JoinGHCN_Table.NAME_WEATHER = WeatherData.NAME_WEATHER); JoinQuery

dt-woods commented 4 years ago

Nice one, Deb. I'll admit, Pandas does support something proximal to SQL, but you just can't beat the real deal.