catalyst-cooperative / pudl-scrapers

Scrapers used to acquire snapshots of raw data inputs for versioned archiving and replicable analysis.
MIT License

Create scrapers for historical FERC Visual FoxPro data #49

Closed: zaneselvans closed this 1 year ago

zaneselvans commented 1 year ago

Since I've been poking around in the scrapers and it's fresh in my mind, I went ahead and created scrapers for all the old Visual FoxPro DBF databases.

I simplified the FERC 1 scraper so that it always downloads all of the available years, rather than allowing one year at a time, and used that pattern for all of the other scrapers as well.

I didn't add tests for all of these basically identical scrapers. The existing FERC 1 tests seemed kind of perfunctory to me. I'm not sure what the right kinds of tests are for these things... I guess we want to make sure that they actually get all of the available files, and that they keep working with the data out there in the real world. Is that an integration test we should implement? Or is it the nature of the scrapers that we should just be running them for real on a regular basis, and fix them when that usage breaks?
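One possible shape for the "does it get all of the available files" check is sketched below. It is a rough illustration only: it uses the standard library rather than the spiders in this PR, and `ZipLinkParser`, `find_archive_links`, `SAMPLE_LISTING`, and the file paths in it are all hypothetical names and data, not anything in the repo or on the FERC site.

```python
from html.parser import HTMLParser


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip archive."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match case-insensitively so FOO.ZIP is not silently skipped.
        if href and href.lower().endswith(".zip"):
            self.links.append(href)


def find_archive_links(html):
    """Return all archive links found in a listing page, in page order."""
    parser = ZipLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical saved copy of a listing page, used as an offline fixture.
SAMPLE_LISTING = """
<ul>
  <li><a href="/data/f2_1996.zip">1996</a></li>
  <li><a href="/data/f2_1997.zip">1997</a></li>
  <li><a href="/docs/readme.pdf">Readme</a></li>
  <li><a href="/data/F2_1998.ZIP">1998</a></li>
</ul>
"""


def test_finds_every_archive():
    links = find_archive_links(SAMPLE_LISTING)
    # Every archive is found, and the non-archive link is ignored.
    assert links == [
        "/data/f2_1996.zip",
        "/data/f2_1997.zip",
        "/data/F2_1998.ZIP",
    ]


if __name__ == "__main__":
    test_finds_every_archive()
    print("ok")
```

A fixture-based test like this only shows that the parsing logic handles a page we have already seen. It would not notice the real site changing, so it complements rather than replaces running the scrapers against live data on a regular basis.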

Closes #41

codecov[bot] commented 1 year ago

Codecov Report

Merging #49 (237f6bb) into main (c2941f9) will decrease coverage by 1.2%. The diff coverage is 54.7%.

@@           Coverage Diff           @@
##            main     #49     +/-   ##
=======================================
- Coverage   64.6%   63.4%   -1.3%     
=======================================
  Files         15      18      +3     
  Lines        555     637     +82     
=======================================
+ Hits         359     404     +45     
- Misses       196     233     +37     
Impacted Files                          Coverage Δ
src/pudl_scrapers/spiders/ferc2.py      39.3% <39.3%> (ø)
src/pudl_scrapers/spiders/ferc6.py      56.5% <56.5%> (ø)
src/pudl_scrapers/spiders/ferc60.py     56.5% <56.5%> (ø)
src/pudl_scrapers/items.py              76.0% <76.9%> (+0.3%) ↑
src/pudl_scrapers/spiders/ferc1.py      73.9% <100.0%> (+10.2%) ↑
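
The overall percentages can be checked against the Hits and Lines rows of the diff above. The short sketch below assumes coverage is simply hits divided by lines; `coverage_percent` is a hypothetical helper, not part of Codecov or this repo.

```python
def coverage_percent(hits, lines):
    """Return line coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


# Hits and Lines as reported in the coverage diff above.
base = coverage_percent(359, 555)   # main
head = coverage_percent(404, 637)   # this PR

print(f"main:   {base:.2f}%")            # main:   64.68%
print(f"#49:    {head:.2f}%")            # #49:    63.42%
print(f"change: {head - base:+.2f} points")  # change: -1.26 points
```

The unrounded change of about -1.26 points falls between the 1.2% in the summary line and the -1.3% in the table, so the two figures may differ only in how each one is rounded or truncated. That is a guess, not something the report states.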
