City-Bureau / city-scrapers-cle

City Scrapers project for Cleveland
https://cityscrapers.org/
MIT License

🕷️ Fix spider: Cuyahoga County Technical Advisory Committee #72

SimmonsRitchie closed 10 months ago

SimmonsRitchie commented 10 months ago

What's this PR do?

Fixes our Cuyahoga County Technical Advisory Committee spider (aka cuya_technical_advisory_committee), which broke because the source site's URL and page structure changed.

[Note: Builds on #71, which should be reviewed first]

Why are we doing this?

We want working scrapers, of course 🤖 This PR updates the URL and the parser for the cuya_technical_advisory_committee spider.

Steps to manually test

After installing the project using pipenv (see the README):

  1. Activate the virtual environment:

    pipenv shell
  2. Run the spider:

    scrapy crawl cuya_technical_advisory_committee -O test_output.csv
  3. Monitor stdout and ensure that the crawl proceeds without raising any errors. Pay attention to the final status report from Scrapy.

  4. Inspect test_output.csv to ensure the data looks valid. I suggest opening a few of the URLs under the source column of test_output.csv and comparing the data for the row with what you see on the page.
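
As a complement to the manual inspection in step 4, here is a minimal sketch of a script that flags rows in test_output.csv with empty values. It is not part of this PR. The helper name find_incomplete_rows is hypothetical, and the column names title and start are assumptions; only the source column is mentioned in the steps above.

```python
import csv
from pathlib import Path

# Assumed column names: only "source" is confirmed by the test steps;
# "title" and "start" are guesses and may need adjusting.
REQUIRED_COLUMNS = ("title", "start", "source")


def find_incomplete_rows(csv_path, required=REQUIRED_COLUMNS):
    """Return (line_number, missing_columns) for rows lacking a required value.

    Line numbers assume the header is on line 1 and that no field
    contains an embedded newline.
    """
    problems = []
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        for line_number, row in enumerate(csv.DictReader(handle), start=2):
            # A column counts as missing if it is absent or blank.
            missing = [col for col in required if not (row.get(col) or "").strip()]
            if missing:
                problems.append((line_number, missing))
    return problems


if __name__ == "__main__":
    output = Path("test_output.csv")
    if output.exists():
        for line_number, missing in find_incomplete_rows(output):
            print(f"line {line_number}: missing {', '.join(missing)}")
```

This only catches empty fields. Comparing a few rows against their source pages by hand is still the real check.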

Are there any smells or added technical debt to note?

skorasaurus commented 10 months ago

Thanks so much for all of your work 💯 🥇

SimmonsRitchie commented 10 months ago

Thank you, @skorasaurus! I appreciate the kind words.