biglocalnews / warn-scraper

Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites
https://warn-scraper.readthedocs.io
Apache License 2.0

Job Center data quality script(s) #178

Open zstumgoren opened 3 years ago

zstumgoren commented 3 years ago

See the Job Center docs for background on the scraping strategy and issues described below.

After cutting over to the Job Center site class for AZ, DE, KS and OK (#126), we should create one or more scripts, runnable on an automated schedule, that:

  1. Check for records in each state that are missing Notice Date values (these records are not captured by the new date-based scraping strategy)
  2. Check for the addition of historical data from years prior to the hard-coded stop_year in each state's scrape function
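As a rough starting point, the two checks above could be sketched as a single function. This is only an illustration under several assumptions: the records have already been fetched by some query that does not filter on date (how to do that against the Job Center sites is not covered here), each record is a dict with a hypothetical `notice_date` field in `MM/DD/YYYY` format, and `check_records` and `parse_year` are made-up names rather than part of warn-scraper.

```python
from datetime import datetime


def parse_year(value, fmt="%m/%d/%Y"):
    """Return the year of a date string, or None if it cannot be parsed."""
    try:
        return datetime.strptime(value, fmt).year
    except ValueError:
        return None


def check_records(records, stop_year, date_field="notice_date"):
    """Sort records into the problem buckets described in the two checks.

    `records` is an iterable of dicts; `date_field` is a hypothetical
    column name and may differ from what the real scraper emits.
    """
    report = {"missing_notice_date": [], "before_stop_year": [], "unparseable": []}
    for rec in records:
        raw = (rec.get(date_field) or "").strip()
        if not raw:
            # Check 1: no Notice Date, so a date-based scrape would skip it
            report["missing_notice_date"].append(rec)
            continue
        year = parse_year(raw)
        if year is None:
            # Not one of the two checks, but worth surfacing for review
            report["unparseable"].append(rec)
        elif year < stop_year:
            # Check 2: data older than the hard-coded stop_year
            report["before_stop_year"].append(rec)
    return report


# Made-up sample rows, purely for illustration
sample = [
    {"company": "Example Co A", "notice_date": ""},
    {"company": "Example Co B", "notice_date": "03/15/2005"},
    {"company": "Example Co C", "notice_date": "01/02/2021"},
]
report = check_records(sample, stop_year=2016)
```

A scheduled job could run something like this per state and fail (or open an issue) whenever either of the first two buckets is non-empty.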

An example of a record without a Notice Date is Eaton in Kansas:

chriszs commented 5 months ago

Possibly related: #598