biglocalnews / aw-scripts

Misc scripts and other code bits that haven't graduated to their own repos.

Fix 404 responses manually where applicable. #4

Open DiPierro opened 3 years ago

DiPierro commented 3 years ago

try_scraper.py does not discover the alias of a CivicPlus website when the base URL does not redirect to the true URL, even though a quick online search usually turns up the correct URL.

For example, http://ar-rogers.civicplus.com/AgendaCenter returns a 404 response with no redirect, but a quick search shows the correct URL is https://www.rogersar.gov/AgendaCenter. There seems to be a similar problem with http://oh-greenecounty.civicplus.com/, which fails to redirect to https://www.greenecountyohio.gov/AgendaCenter.

We should check whether other 404 responses follow the same pattern and manually edit our list of sites accordingly.
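
One way to find the candidates for manual editing is to sort each URL by what it returns. Below is a minimal sketch, not code from try_scraper.py: the function names (`classify`, `check_url`, `find_manual_fixes`) are hypothetical, and it assumes the site list is available as a plain list of URL strings. It flags the URLs that return a 404, which are the ones that need a manual search for the true address.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def classify(requested_url, status, final_url):
    """Label one response: 'not_found', 'redirected', 'ok', or 'other'."""
    if status == 404:
        return "not_found"
    if 200 <= status < 300:
        # Compare hosts only: a scheme or path change on the same host
        # is not treated as an alias.
        if urlparse(requested_url).hostname != urlparse(final_url).hostname:
            return "redirected"
        return "ok"
    return "other"


def check_url(url, timeout=10):
    """Fetch url, following redirects, and return (status, final_url)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the final URL is not
        # needed for those cases, so the requested URL is returned.
        return err.code, url


def find_manual_fixes(urls, fetch=check_url):
    """Return the URLs that respond with a 404 and need a manual lookup."""
    needs_fix = []
    for url in urls:
        status, final_url = fetch(url)
        if classify(url, status, final_url) == "not_found":
            needs_fix.append(url)
    return needs_fix
```

Calling `find_manual_fixes(urls)` on the site list would return the subset to look up by hand. The `fetch` parameter exists only so the logic can be exercised without network access. Connection failures such as DNS errors are not handled in this sketch and would need their own category.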

DiPierro commented 3 years ago

I've manually fixed or removed the URLs that return 404 responses. Here's an R script showing the steps I took. I have updated the public list of known CivicPlus sites with the changes documented in this script.