CodeForPhilly / clean-and-green-philly

Dashboard to target Philly vacant properties for anti-gun violence interventions
https://www.cleanandgreenphilly.org/
MIT License

Dynamically determine the URL of the ParkServe dataset #705

Closed — zigouras closed this 4 days ago

zigouras commented 3 months ago

Describe the task

The feature layer defined in the park_priority.py code pulls the GIS shapefile data from a hard-coded URL: https://parkserve.tpl.org/downloads/Parkserve_Shapefiles_05212024.zip. Since this URL contains a date, it will presumably change when new data is published.

The link to the most recent dataset is listed on this page: https://www.tpl.org/park-data-downloads. We should write code to scrape this web page and get the link to the shapefile dynamically, so we always fetch the most recent one. The link is defined on the page as: <a href="https://parkserve.tpl.org/downloads/Parkserve_Shapefiles_05212024.zip">Shapefile</a>

Acceptance Criteria

Write a Python function that scrapes the website for the URL of the latest shapefile, and use that URL in the current park priority layer code. Add unit tests asserting this functionality.
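A minimal sketch of what such a function could look like, assuming the page structure shown above (an anchor whose href points at a dated `Parkserve_Shapefiles_*.zip` on parkserve.tpl.org). The function and constant names here are hypothetical, and the parsing is split out so it can be unit-tested without hitting the network:

```python
# Sketch only: scrape the TPL park-data page for the latest ParkServe
# shapefile URL, falling back to the known hard-coded URL on failure.
import re

PARK_DATA_PAGE = "https://www.tpl.org/park-data-downloads"
FALLBACK_URL = "https://parkserve.tpl.org/downloads/Parkserve_Shapefiles_05212024.zip"

def extract_shapefile_url(html: str) -> str:
    """Return the ParkServe shapefile URL found in the page HTML,
    or the hard-coded fallback if no matching link is present."""
    match = re.search(
        r'href="(https://parkserve\.tpl\.org/downloads/'
        r'Parkserve_Shapefiles_\d+\.zip)"',
        html,
    )
    return match.group(1) if match else FALLBACK_URL

def get_latest_shapefile_url() -> str:
    """Fetch the downloads page and extract the shapefile link."""
    import requests  # assumes the project already uses requests

    resp = requests.get(PARK_DATA_PAGE, timeout=30)
    resp.raise_for_status()
    return extract_shapefile_url(resp.text)
```

Keeping `extract_shapefile_url` pure (HTML in, URL out) means the unit tests can assert against canned HTML fixtures, and the fallback URL keeps the pipeline working if the page layout ever changes.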

Additional context

The file is currently about 1.5 GB, so it takes a while to download, and we only need some of the files inside that zip. If you can update the code so that only the slices of the zip file we need are downloaded, reducing bandwidth and download time, that would be a nice additional enhancement.
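Partial download is possible in principle because the zip format keeps its table of contents (the central directory) at the end of the file, so a client that supports HTTP Range requests can read just the directory plus the members it wants. A sketch under that assumption, with the byte-range fetcher injected so it can be tested offline (all names here are hypothetical):

```python
# Sketch only: read selected members from a remote zip without
# downloading the whole archive, via a seekable file backed by
# byte-range reads. For the real dataset, fetch_range would issue
# an HTTP GET with a "Range: bytes=start-end" header (the server
# must support range requests for this to work).
import io
import zipfile

class RangeFile(io.RawIOBase):
    """Minimal seekable read-only file over a byte-range fetcher."""

    def __init__(self, fetch_range, size):
        self._fetch = fetch_range  # fetch_range(start, end) -> bytes, end inclusive
        self._size = size
        self._pos = 0

    def seekable(self):
        return True

    def readable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        else:  # io.SEEK_END
            self._pos = self._size + offset
        return self._pos

    def read(self, n=-1):
        if n < 0:
            n = self._size - self._pos
        end = min(self._pos + n, self._size) - 1
        if end < self._pos:
            return b""
        data = self._fetch(self._pos, end)
        self._pos += len(data)
        return data

def extract_members(fetch_range, size, names):
    """Read only the named members; zipfile seeks to each member's
    offset, so only those byte ranges (plus the directory) are fetched."""
    with zipfile.ZipFile(RangeFile(fetch_range, size)) as zf:
        return {name: zf.read(name) for name in names}
```

With a real HTTP backend, `size` would come from the `Content-Length` of a HEAD request and `fetch_range` from a ranged GET; whether TPL's server honors Range requests would need to be checked first.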

elizzakai commented 2 months ago

just to clarify, we're web scraping for this link within the website? (picture of the link below) and then optimizing to only download the specific sections of the zip, since you guys don't need everything?

[Screenshot of the Shapefile link on the park data downloads page]

zigouras commented 2 months ago

@elizzakai Yes: web scraping the link, and using the fetched link in place of the hard-coded URL in park_priority.py.

The part about only downloading parts of the zip file is a secondary bonus feature. I am not sure how to do it, or even whether it can be done, but if you want to take a crack at it, feel free. The scraping is the more important piece.

Do you want me to assign this to you?

elizzakai commented 2 months ago

yeah go ahead

elizzakai commented 1 month ago

(hey i'll probably need some more time, not sure if I'll get to it yet this weekend)

zigouras commented 1 month ago

> (hey i'll probably need some more time, not sure if I'll get to it yet this weekend)

That's fine. You will need to start by setting up the backend on your machine. We need more people to test that anyway, so that will be useful in and of itself.

nlebovits commented 4 days ago

closed by #901