biglocalnews / warn-scraper

Command-line interface for downloading WARN Act notices of qualified plant closings and mass layoffs from state government websites
https://warn-scraper.readthedocs.io
Apache License 2.0

Fix GA: new scraper needed #519

Closed esagara closed 11 months ago

esagara commented 1 year ago

Georgia has changed how it is handling WARN notices and now has a new data portal that does not appear to be working yet. We will need to monitor and create a new scraper to handle this.

Ash1R commented 1 year ago

They also have archived notices here, but they only go back to 2018.

Should we scrape these as well?

Ash1R commented 1 year ago

I tried scraping the new data portal, but the URL request returns an intermediate page with a button to "skip to content". I don't think that's the main problem, however. When loading the page in a browser, the table takes a few seconds to load, so I'm guessing it's not getting scraped either, since a plain request only sees the page before the table is filled in. Is there any way to solve this?
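One common way to handle a table that appears a few seconds after page load is to poll the browser-rendered page until the rows show up, and give up after a timeout. The sketch below illustrates that idea with the standard library only; it is not the approach taken in this repository. The names `wait_for_table`, `make_fake_page`, and the `"<tr"` marker are hypothetical, and the fake page stands in for a real browser session. With Selenium, `fetch_html` could be something like `lambda: driver.page_source`, or the library's own `WebDriverWait` could be used instead.

```python
import time
from typing import Callable, Optional


def wait_for_table(
    fetch_html: Callable[[], str],
    marker: str = "<tr",
    timeout: float = 10.0,
    interval: float = 0.5,
) -> Optional[str]:
    """Poll fetch_html() until the HTML contains marker.

    Returns the HTML once the marker appears, or None if the
    timeout expires first. Both names are hypothetical helpers.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = fetch_html()
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def make_fake_page(ready_after: int) -> Callable[[], str]:
    """Simulate a page whose table only appears on the Nth fetch."""
    calls = {"n": 0}

    def fetch() -> str:
        calls["n"] += 1
        if calls["n"] < ready_after:
            # Early fetches see only the intermediate page.
            return "<div id='skip'>Skip to content</div>"
        return "<table><tr><td>Example Co.</td></tr></table>"

    return fetch


if __name__ == "__main__":
    html = wait_for_table(make_fake_page(3), timeout=2.0, interval=0.01)
    print("table loaded" if html else "timed out")
```

Polling only helps if `fetch_html` reads from something that actually runs the page's JavaScript, such as a browser driver. Repeating a plain HTTP request would keep returning the same unrendered page.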

threalboss commented 1 year ago

I have been trying to scrape the new GA website with Selenium, and it was kind of working, but I was not able to gather all of the data. I also have a direct CSV file from the GA Rapid Response group with all the archived data.

https://www.dropbox.com/sh/05wzmaqhx8of5cj/AABjNnm6tuX_MQcK6Ow8XZYqa?dl=0

stucka commented 1 year ago

For the record, @Ash1R has a pull request here: https://github.com/biglocalnews/warn-scraper/pull/530

stucka commented 11 months ago

Fixed with #530