Open krammy19 opened 3 years ago
Which URLs/sites are these? Is it this one: CA_city_websites_final.csv?
Hello, Could you please specify which URLs to check the status for?
Ok cool. Could you please specify the URLs to check the status? Is it one of the columns in this file 'CA_city_websites_final.csv'?
Sorry about the delay in responding! I'm talking about the URLs returned by the html-request scraper.
I would encourage you to try running the scraper on your own to find any issues, but you can also find the output on this Google Sheet: https://docs.google.com/spreadsheets/d/11offSYz2irnjI-9tILkcI-ClclRUZ0pyhXtPy-G4i8g/edit?usp=sharing
All columns besides CITY and CITY_URL need to be quality-checked.
Update: html-request scraper 2 has been renamed to AHP_parser
It would be good to verify that the URLs we're pulling in are actually valid and load without errors.
Can someone please do a simple loop on the AHP_parser to request the sites and pull the status codes? If we're getting anything besides 200 codes, then we have some problems.