What's this PR do?
Fixes our Cuyahoga County Planning Commission spider (a.k.a. cuya_planning), which was raising errors because the meeting location appears to have changed.
Why are we doing this?
We want working scrapers, of course 🤖 This PR updates the spider's location validation so that it runs without raising errors.
Steps to manually test
After installing the project using pipenv:
1. Activate the virtual environment:
   pipenv shell
2. Run the spider:
   scrapy crawl cuya_planning -O test_output.csv
3. Monitor stdout and ensure that the crawl proceeds without raising any errors. Pay attention to the final status report from Scrapy.
4. Inspect test_output.csv to ensure the data looks valid. I suggest opening a few of the URLs under the source column and comparing each row's data with what you see on the page.
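As a quick complement to eyeballing the CSV, here is a minimal sketch that flags rows with empty fields. It is not part of this PR. The function name `rows_missing_fields` is hypothetical, and the `title` and `start` column names are assumptions about the output schema; only the `source` column is mentioned above.

```python
import csv


def rows_missing_fields(csv_path, required=("title", "start", "source")):
    """Return (data_row_number, missing_field_names) for incomplete rows.

    The default column names are assumptions; adjust them to match the
    header of the CSV the spider actually produces.
    """
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        # Data rows are numbered from 1; the header row is not counted.
        for row_number, row in enumerate(csv.DictReader(f), start=1):
            missing = [
                name for name in required
                if not (row.get(name) or "").strip()
            ]
            if missing:
                problems.append((row_number, missing))
    return problems
```

Calling `rows_missing_fields("test_output.csv")` after the crawl should return an empty list if every row has the assumed fields filled in. It does not replace comparing rows against the source pages.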
Are there any smells or added technical debt to note?
The new location validator only checks for the street address of the county's office. If this page uses variants of the office address (e.g. just listing "Administrative headquarters"), we may get more validation errors. For now, hopefully this change will suffice.
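If variants do start appearing, one possible follow-up is a validator that accepts several known markers instead of a single street address. The sketch below is only an illustration under stated assumptions, not the code in this PR: `validate_location`, `KNOWN_LOCATION_MARKERS`, and the placeholder address are all hypothetical.

```python
# Hypothetical placeholder values. The real street address and any
# variants would come from the spider and the county's page.
KNOWN_LOCATION_MARKERS = (
    "123 Example St",
    "Administrative Headquarters",
)


def validate_location(text, markers=KNOWN_LOCATION_MARKERS):
    """Raise ValueError unless the text mentions a known location marker."""
    # Collapse whitespace and ignore case so minor formatting
    # differences on the page do not trigger false alarms.
    normalized = " ".join(text.split()).lower()
    if not any(marker.lower() in normalized for marker in markers):
        raise ValueError("Meeting location has changed")
```

The trade-off is that a looser check is less likely to fail on harmless wording changes, but also slightly less likely to catch a genuine move to a new venue.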