nickeubank / mtv_viacom_capstone

Nearest Polling Place Distance Calculation #26

Closed jgy4 closed 2 years ago

jgy4 commented 3 years ago

Hello all, and especially @nickeubank! This issue follows up on the problem of calculating the distance from each college to its nearest polling place (returning zero if the polling place lies within the college polygon, and a measurable distance otherwise).

Currently I have two data frames used in the calculation, polling_gdf and subset_college:

import geopandas

polling = geopandas.read_file('../00_source_data/2020 Polling Data/polling_pk_master_post.csv')

polling_gdf = geopandas.GeoDataFrame(polling, geometry=geopandas.points_from_xy(polling.longitude, polling.latitude))

subset_college = geopandas.read_file('../20_intermediate_files/subset_final_college_polygon.csv', GEOM_POSSIBLE_NAMES="geometry", KEEP_GEOM_COLUMNS="NO")

I'm using '4326' for the projection:

subset_college.crs = 4326
polling_gdf.crs = 4326

I'm using this function to grab the distance to the nearest polling place:

def nearest_poll_dist(row, polling_df):
    polling_distance = polling_df['geometry'].distance(row.geometry).sort_values().values[0]
    return polling_distance

For the college in a given row, this function calculates the distance to every polling place geometry, sorts the values, and returns the smallest one. Because both layers are in EPSG:4326, the distance returned is (I believe) measured in decimal degrees of latitude/longitude rather than in miles or meters.
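To illustrate why a distance in degrees is hard to interpret, here is a minimal standard-library sketch. It is not part of the notebook: the helper name `haversine_m`, the spherical-Earth radius, and the sample coordinates are all assumptions chosen for illustration. It compares the ground length of the same 0.01 degree offset at two latitudes.

```python
import math

# Mean Earth radius in meters (spherical approximation, an assumption).
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# The same 0.01 degree east-west offset at two illustrative latitudes.
at_equator = haversine_m(0.0, -79.0, 0.0, -78.99)
at_36_north = haversine_m(36.0, -79.0, 36.0, -78.99)

# The same 0.01 degree offset, but north-south, at latitude 36.
north_south = haversine_m(36.0, -79.0, 36.01, -79.0)

print(f"0.01 deg of longitude at the equator: {at_equator:.0f} m")
print(f"0.01 deg of longitude at 36 N:        {at_36_north:.0f} m")
print(f"0.01 deg of latitude at 36 N:         {north_south:.0f} m")
```

Under these assumptions the east-west offset comes out at roughly 1.1 km at the equator but only about 0.9 km at 36 degrees north, so a single number in degrees does not map to a single ground distance.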

I have other functions grabbing the name, index, and geometry of the polling places. They're all contained in the '105_NearestPolling_01.ipynb' file. #17

Thank you in advance for any help figuring out the best way to approach this!

nickeubank commented 3 years ago

WOOT! Great work @jgy4. You've definitely put together a solid implementation. Here's that reading I mentioned in person: https://geopandas.org/docs/user_guide/projections.html#setting-a-projection
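As a rough sketch of the idea in that guide (not necessarily the approach this project ends up using): declare the data as EPSG:4326, then reproject both layers with `to_crs` to a CRS whose units are meters before measuring. The toy geometries below are hypothetical stand-ins for the real files, and EPSG:5070 (NAD83 / Conus Albers) is an assumed choice of projected CRS; a local UTM or state plane zone may suit a given study area better.

```python
import geopandas
from shapely.geometry import Point, Polygon

# Toy stand-ins for the real data (hypothetical coordinates, lon/lat order).
polling_gdf = geopandas.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(-79.0, 36.0), Point(-78.9, 36.0)],
    crs="EPSG:4326",
)
subset_college = geopandas.GeoDataFrame(
    {"college": ["toy campus"]},
    geometry=[Polygon([(-79.01, 35.99), (-78.99, 35.99),
                       (-78.99, 36.01), (-79.01, 36.01)])],
    crs="EPSG:4326",
)

# Assumed projected CRS with meter units (NAD83 / Conus Albers).
PROJECTED_CRS = "EPSG:5070"
polling_m = polling_gdf.to_crs(PROJECTED_CRS)
college_m = subset_college.to_crs(PROJECTED_CRS)


def nearest_poll_dist(row, polling_df):
    # .min() returns the same value as sorting and taking the first element.
    return polling_df.geometry.distance(row.geometry).min()


# One polling place sits inside the polygon, so the nearest distance is zero.
college_m["nearest_poll_m"] = college_m.apply(
    nearest_poll_dist, polling_df=polling_m, axis=1
)

# With only the outside polling place, the distance is measured in meters
# from the polygon boundary.
outside_only_m = nearest_poll_dist(college_m.iloc[0], polling_m.iloc[[1]])

print(college_m[["college", "nearest_poll_m"]])
print(f"Distance to the outside polling place: {outside_only_m:.0f} m")
```

Note that assigning `.crs` or calling `set_crs` only labels the coordinates, while `to_crs` actually transforms them, which is why the reprojection step is what changes the units of the result.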

More soon...