biglocalnews / warn-scraper

Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites
https://warn-scraper.readthedocs.io
Apache License 2.0

SSL verification bypass for Job Center sites #539

Closed stucka closed 1 year ago

stucka commented 1 year ago

It may be useful to be able to easily disable SSL certificate verification for some Job Center sites. But passing that variable along is more involved, because functions call functions call functions call functions.

It looks like /warn/platforms/job_center/utils.py should have:

    def scrape_state(
        state_postal, search_url, output_csv, stop_year, cache_dir, use_cache=True, verify=True
    ):

but that calls on a bunch of stuff in /warn/platforms/job_center/site.py

line 41:

    def scrape(self, start_date=None, end_date=None, detail_pages=True, use_cache=True):

lines 64+:

    kwargs = {
        "params": self._search_kwargs(start_date=start, end_date=end),
        "use_cache": use_cache,
        "detail_pages": detail_pages,
    }

82+:

    self._scrape_next_page(
        next_page_link, html_store, data, detail_pages, use_cache
    )

146+:

    def _scrape_next_page(
        self, next_page_link, html_store, data, detail_pages, use_cache
    ):
        """Scrape the results of next page and update the payload."""
        kwargs = {
            "params": {},
            "detail_pages": detail_pages,
            "use_cache": use_cache,
        }
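One way to avoid threading `verify` through every signature in that chain might be to store it once on the site object and read it only where the HTTP request is made. The sketch below is not the project's code: `SiteSketch`, `scrape_state_sketch`, `_request_page`, and the injected `fetch` callable are all hypothetical names, and the call chain is reduced to two pages to mirror the `scrape` -> `_scrape_next_page` structure quoted above.

```python
from typing import Any, Callable, Dict, List

# Hypothetical fetch signature: fetch(url, params=..., verify=...) -> str.
# With the requests library this could wrap
# requests.get(url, params=params, verify=verify).text
Fetch = Callable[..., str]


class SiteSketch:
    """Hypothetical stand-in for the Job Center site class."""

    def __init__(self, url: str, fetch: Fetch, verify: bool = True) -> None:
        self.url = url
        self.fetch = fetch
        # Stored once, so nested methods need no extra parameter.
        self.verify = verify

    def scrape(self, detail_pages: bool = True, use_cache: bool = True) -> List[str]:
        kwargs: Dict[str, Any] = {
            "params": {"page": 1},
            "detail_pages": detail_pages,
            "use_cache": use_cache,
        }
        data = [self._request_page(**kwargs)]
        self._scrape_next_page(data, detail_pages, use_cache)
        return data

    def _scrape_next_page(
        self, data: List[str], detail_pages: bool, use_cache: bool
    ) -> None:
        # Signature is unchanged by the new option; verify is not passed in.
        kwargs: Dict[str, Any] = {
            "params": {},
            "detail_pages": detail_pages,
            "use_cache": use_cache,
        }
        data.append(self._request_page(**kwargs))

    def _request_page(
        self, params: Dict[str, Any], detail_pages: bool, use_cache: bool
    ) -> str:
        # The only place that reads the verify setting.
        return self.fetch(self.url, params=params, verify=self.verify)


def scrape_state_sketch(
    search_url: str, fetch: Fetch, use_cache: bool = True, verify: bool = True
) -> List[str]:
    """Hypothetical entry point: accepts verify and hands it to the site object."""
    site = SiteSketch(search_url, fetch, verify=verify)
    return site.scrape(use_cache=use_cache)
```

Calling `scrape_state_sketch(url, fetch, verify=False)` would then reach every request in the chain. The alternative is to add `verify` to each `kwargs` dict and each method signature listed above, which is more explicit but touches more lines.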

https://github.com/biglocalnews/warn-scraper/issues/538

stucka commented 1 year ago

Built with https://github.com/biglocalnews/warn-scraper/compare/1.2.36...1.2.37