catalyst-cooperative / pudl-scrapers

Scrapers used to acquire snapshots of raw data inputs for versioned archiving and replicable analysis.
MIT License

Use new https:// source for the EPA CEMS data #26

Closed zaneselvans closed 1 year ago

zaneselvans commented 1 year ago

EPA now appears to be providing (and preferring) https:// access to the FTP server that has historically been used to distribute the EPA CEMS hourly emissions data. This means we can get rid of the janky FTP-specific logic in our epacems data downloading script and use the standard requests infrastructure instead, which will hopefully be much faster (and might be able to run in async mode?).

The new address is: https://gaftp.epa.gov/DMDnLoad/emissions/hourly/monthly/
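
For illustration, here is a rough sketch of what a plain `requests` download against that address might look like. Only the base URL comes from this issue. The `epacems_url` and `download` helpers are hypothetical, and the `<year>/<year><state><month>.zip` layout is an assumption carried over from how the old FTP tree was organized, so it should be checked against the actual directory listing before relying on it.

```python
from pathlib import Path
from urllib.parse import urljoin

# Base address quoted in this issue.
BASE_URL = "https://gaftp.epa.gov/DMDnLoad/emissions/hourly/monthly/"


def epacems_url(year: int, state: str, month: int) -> str:
    """Build the URL for one state-month archive.

    ASSUMPTION: files live at <year>/<year><state><month>.zip, mirroring the
    old FTP layout. Verify against the server's directory listing.
    """
    filename = f"{year}{state.lower()}{month:02d}.zip"
    return urljoin(BASE_URL, f"{year}/{filename}")


def download(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Stream one file to disk using requests (hypothetical helper)."""
    # Imported here so the URL helper above works without requests installed.
    import requests

    with requests.get(url, stream=True, timeout=timeout) as resp:
        resp.raise_for_status()  # fail loudly on 4xx/5xx responses
        with dest.open("wb") as fh:
            # Write in 1 MiB chunks to avoid holding the whole file in memory.
            for chunk in resp.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
    return dest


if __name__ == "__main__":
    url = epacems_url(2019, "AL", 1)
    download(url, Path(url.rsplit("/", 1)[-1]))
```

Because each file is an independent HTTPS GET, the same URL builder could feed a thread pool or an async HTTP client if the speedup is worth pursuing.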

zaneselvans commented 1 year ago

@zschira did you say this issue had been addressed in the scraper-archiver repo merge changes?

zschira commented 1 year ago

Oh yeah, forgot to add this to the sprint, but the new scraper/archiver is using the https source.

zschira commented 1 year ago

Closing; see the new archiver repo.