palewire closed this issue 1 year ago
As seen here
pipenv run python -m warn.cli sc -l DEBUG
2022-12-01 12:07:35,306 - warn.runner - Scraping sc
2022-12-01 12:07:35,306 - warn.utils - Requesting https://scworks.org/employer/employer-programs/at-risk-of-closing/layoff-notification-reports
/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/urllib3/connectionpool.py:1045: InsecureRequestWarning: Unverified HTTPS request is being made to host 'scworks.org'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
  warnings.warn(
2022-12-01 12:07:35,451 - warn.utils - Response code: 404
2022-12-01 12:07:35,451 - warn.cache - Writing to cache data/warn-scraper/cache/sc/source.html
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/hostedtoolcache/Python/3.9.15/x64/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/warn/cli.py", line 79, in <module>
    main()
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/warn/cli.py", line 75, in main
    runner.scrape(scrape)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/warn/runner.py", line 52, in scrape
    data_path = state_mod.scrape(self.data_dir, self.cache_dir)
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/warn/scrapers/sc.py", line 56, in scrape
    a_href = a["href"]
  File "/home/runner/.local/share/virtualenvs/warn-github-flow-R1xICqqL/lib/python3.9/site-packages/bs4/element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'href'
make: *** [Makefile:71: scrape] Error 1
Looks like the URL we are pulling now returns a 404:
https://scworks.org/employer/employer-programs/at-risk-of-closing/layoff-notification-reports
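The log suggests the scraper went on to parse the 404 page and then hit an `<a>` tag with no `href`, which is what raised the `KeyError`. A minimal sketch of a more defensive approach, assuming we want to fail early on a non-200 response and skip anchors that lack an `href`. It uses the standard library's `html.parser` rather than the BeautifulSoup code in `sc.py`, and `HrefCollector` and `extract_links` are hypothetical names, not part of warn-scraper:

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values from <a> tags, skipping anchors that have none."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # dict.get returns None for a missing key instead of raising KeyError
        href = dict(attrs).get("href")
        if href:
            self.hrefs.append(href)


def extract_links(status_code, html):
    """Hypothetical helper: stop on an error response, otherwise return hrefs."""
    if status_code != 200:
        raise RuntimeError(
            f"Unexpected response code {status_code}; the source URL may have moved"
        )
    parser = HrefCollector()
    parser.feed(html)
    return parser.hrefs


# The first anchor has no href, like the one that broke the scraper; it is skipped.
sample = '<a name="top">Top</a><a href="/reports/2022.pdf">2022</a>'
print(extract_links(200, sample))  # ['/reports/2022.pdf']
```

In the BeautifulSoup code itself, the equivalent of the `dict.get` call would probably be `a.get("href")` in place of `a["href"]`, though the real fix here is likely to point the scraper at the new URL.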
As seen here