isudatateam / datateam

ISU Data Team Effort
MIT License

Standardize ISBI Data Number 2 #194

Closed loriabendroth closed 4 years ago

loriabendroth commented 4 years ago

@giorgichi, can you standardize a new data set for ISBI (https://drive.google.com/drive/u/0/folders/1xDqvrgsXy7z45wJnvtRCYEnStjRQh7Qx)? It is complementary to another data set, so we want to use the same headers and variables wherever possible.

Tasks include:

Additional standardizing within some of the columns will be done in the future. We will also need to cross-check this data for events not reported within DN1 by water coordinators so we can have a true total number of events held.

giorgichi commented 4 years ago

> Add columns for watershed and contract number as in DN1 and populate based on the town/county information. Add yes/no column if entry is not within our priority watersheds so we can filter out later.

I am not sure whether it is generally accurate to map events to watersheds. This should be clarified with Lori and Jamie.
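For reference, the column task quoted above could be sketched roughly as follows. This is only an illustration: `COUNTY_TO_WATERSHED`, the watershed names, the contract numbers, and the column names are all placeholders, and the real mapping would have to come from DN1.

```python
# Hypothetical county -> (watershed, contract number) lookup.
# The entries are placeholders; the real mapping would come from DN1.
COUNTY_TO_WATERSHED = {
    "story": ("Example Watershed A", "C-001"),
    "boone": ("Example Watershed B", "C-002"),
}


def add_watershed_columns(rows):
    """Return copies of rows with watershed, contract_number and outside_priority added."""
    out = []
    for row in rows:
        # Normalize the county name so "Story " and "story" match the same key.
        key = row.get("county", "").strip().lower()
        watershed, contract = COUNTY_TO_WATERSHED.get(key, ("", ""))
        new_row = dict(row)  # leave the input row untouched
        new_row["watershed"] = watershed
        new_row["contract_number"] = contract
        # "yes" marks entries outside the priority watersheds for later filtering.
        new_row["outside_priority"] = "no" if watershed else "yes"
        out.append(new_row)
    return out


events = [
    {"event": "Field day", "town": "Ames", "county": "Story"},
    {"event": "Workshop", "town": "Des Moines", "county": "Polk"},
]
tagged = add_watershed_columns(events)
```

Rows whose county is not in the lookup keep empty watershed and contract fields and get `outside_priority` set to "yes", so they can be filtered out later without being deleted.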

giorgichi commented 4 years ago

Corrected and partially standardized location names:

loriabendroth commented 4 years ago

@giorgichi The events will be allocated to a particular watershed so even if participants come from a larger area, we will apply it to one watershed only. Is that what you are getting at with your comment?

loriabendroth commented 4 years ago

For town names, we need to follow one standard across all sheets. Can you bring in a standard list and cross-check it against what we have here?
Here is a list of city names for Iowa: https://gist.github.com/Jonathonbyrd/536074 Or from the Census: https://www.census.gov/data/tables/time-series/demo/popest/2010s-total-cities-and-towns.html
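A cross-check like this could be sketched with Python's standard `difflib` module. `STANDARD_TOWNS` here is a short placeholder list, not the contents of either source above, and `check_town` and its `cutoff` value are illustrative choices only.

```python
import difflib

# Placeholder standard list; in practice this would be loaded from
# whichever reference list is chosen.
STANDARD_TOWNS = ["Ames", "Des Moines", "Cedar Rapids", "Nevada"]


def check_town(name, standard=STANDARD_TOWNS, cutoff=0.8):
    """Return (status, suggestion) for one town name."""
    # Collapse repeated whitespace and normalize capitalization.
    cleaned = " ".join(name.split()).title()
    if cleaned in standard:
        return ("ok", cleaned)
    # Fall back to the closest standard name, if any is similar enough.
    close = difflib.get_close_matches(cleaned, standard, n=1, cutoff=cutoff)
    if close:
        return ("suggest", close[0])
    return ("unknown", None)
```

Names flagged as "suggest" or "unknown" would still need a manual look. Note that `str.title()` mishandles names with internal capitals (for example "McGregor"), so those would show up as suggestions rather than exact matches.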

loriabendroth commented 4 years ago

We will also need lat/long for the cities at some point. https://github.com/kelvins/US-Cities-Database

giorgichi commented 4 years ago

> @giorgichi The events will be allocated to a particular watershed so even if participants come from a larger area, we will apply it to one watershed only. Is that what you are getting at with your comment?

@loriabendroth, I should have stated my question more clearly. What I meant to ask is: if a location (city/town) is not within the boundaries of a watershed, does that automatically mean it is not related to that watershed (even if it is close to it)?

loriabendroth commented 4 years ago

Got it! I think we have to be very strict in this sense; otherwise it becomes a slippery slope of what to include. If a location is not within the bounds of the watershed, we will NOT include it for now.
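A strict "inside the bounds or not" rule could be checked with a standard ray-casting point-in-polygon test. This is a minimal sketch, assuming the watershed boundary is available as a list of (lon, lat) vertices; the rectangle below is a made-up boundary, and treating lon/lat as planar coordinates is an approximation that is usually acceptable at this scale.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray starting at the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count crossings to the east of the point.
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular boundary, for illustration only.
boundary = [(-94.0, 42.0), (-93.0, 42.0), (-93.0, 43.0), (-94.0, 43.0)]
```

Points that fall exactly on the boundary are ambiguous under this test, so a rule for those cases would still need to be agreed on.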

loriabendroth commented 4 years ago

Used a 25-mile centroid approach.
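The details of the approach are not spelled out in this thread. One plausible reading, sketched below, is that a town counts toward a watershed when it lies within 25 miles of the watershed centroid. The function names, the coordinates, and the rule itself are assumptions for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(town, centroid, radius_miles=25.0):
    """town and centroid are (lat, lon) pairs; True if the town is within the radius."""
    return haversine_miles(town[0], town[1], centroid[0], centroid[1]) <= radius_miles
```

For example, `within_radius((42.2, -93.0), (42.0, -93.0))` compares a town about 14 miles north of a hypothetical centroid against the 25-mile cutoff.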