Open DiPierro opened 3 years ago
try_scraper.py does not discover the aliases of CivicPlus websites when the base URL does not redirect to the true URL, even though a quick online search easily turns up the correct one.
For example, http://ar-rogers.civicplus.com/AgendaCenter returns a 404 response with no redirect, but a quick search shows the correct URL is https://www.rogersar.gov/AgendaCenter. There seems to be a similar problem with http://oh-greenecounty.civicplus.com/, which fails to redirect to https://www.greenecountyohio.gov/AgendaCenter.
We should check whether other 404 responses follow the same pattern and manually edit our list of sites accordingly.
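A rough sketch of how that check could be automated, using only the Python standard library. This is not how try_scraper.py works; the function names `classify` and `check_site` and the labels "dead", "alias", "ok" and "unreachable" are made up for illustration. It only flags the sites that need a manual search; it cannot find the correct URL on its own.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def classify(requested_url, status_code, final_url):
    """Label one response (hypothetical labels, not from try_scraper.py)."""
    if status_code == 404:
        # No redirect helped us: the true URL has to be looked up by hand.
        return "dead"
    # urlsplit().hostname is lowercased, so this comparison ignores case.
    if urlsplit(requested_url).hostname != urlsplit(final_url).hostname:
        # The site redirected to another host, i.e. an alias was discovered.
        return "alias"
    return "ok"


def check_site(url, timeout=10):
    """Fetch url (urlopen follows redirects) and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(url, resp.status, resp.geturl())
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses, including 404.
        return classify(url, err.code, url)
    except OSError:
        # DNS failures, refused connections, timeouts, and similar.
        return "unreachable"
```

Running `check_site` over the list of known sites and filtering for "dead" would give the set of URLs to search for and fix by hand.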
I've manually fixed or removed the URLs that return 404 responses. Here's an Rscript showing the steps I took. I have updated the public list of known CivicPlus sites with the changes documented in this script.