Closed stucka closed 6 months ago
I have a patch that gets the URL to work, but the code stack does not support .xlsx files. We need to either pin an older version of xlrd that does, or switch to openpyxl or something else.
openpyxl is in the requirements and other scrapers appear to use it.
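For the record, here is a rough sketch of what the openpyxl route could look like. The function names `read_xlsx_rows` and `clean_rows` and the file name in the usage note are made up for illustration and are not taken from the warn-scraper codebase; it also assumes the data lives on the first worksheet, which I have not verified against Kentucky's file.

```python
def read_xlsx_rows(xlsx_path):
    """Return every row of the first worksheet as a list of cell values."""
    # Imported lazily so this module still loads where openpyxl is missing.
    from openpyxl import load_workbook

    # read_only streams the sheet; data_only returns cached formula results.
    workbook = load_workbook(filename=xlsx_path, read_only=True, data_only=True)
    try:
        # Assumption: the WARN notices are on the first worksheet.
        worksheet = workbook.worksheets[0]
        return [list(row) for row in worksheet.iter_rows(values_only=True)]
    finally:
        workbook.close()


def clean_rows(rows):
    """Drop fully empty rows and turn every cell into a stripped string."""
    cleaned = []
    for row in rows:
        # Skip rows where every cell is None or whitespace.
        if all(cell is None or str(cell).strip() == "" for cell in row):
            continue
        cleaned.append(["" if cell is None else str(cell).strip() for cell in row])
    return cleaned
```

Usage would be something like `clean_rows(read_xlsx_rows("ky.xlsx"))`, with the result handed to whatever CSV writer the scraper already uses. Keeping the cleanup separate from the file read means it can be tested without a real workbook.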
ky.py.txt: incremental backup here, as I can't commit to a branch without passing all the tests.
Historical data has been normalized at https://storage.googleapis.com/bln-data-public/warn-layoffs/ky-historical-normalized.csv
Historical data has been copied from Kentucky's site and archived at https://storage.googleapis.com/bln-data-public/warn-layoffs/ky-original-1998-2016.xlsx
Jupyter Notebook for normalizing the data is here -- it was saved in a non-public project, so attaching it here for the record: ky-history-er.ipynb.txt
Latest draft of scraper archived here. Hasn't passed tests yet.
Closed with https://github.com/biglocalnews/warn-scraper/commit/44d6c2c89d4e8c9929d0104ef2ebb83452742c0e apparently
Kentucky's scraper relies on an Excel snapshot that's years out of date now.
Looks like the newer stuff is kept here: https://kcc.ky.gov/Pages/News.aspx
simple approach should look something like