catalyst-cooperative / pudl-scrapers

Scrapers used to acquire snapshots of raw data inputs for versioned archiving and replicable analysis.
MIT License

Archive FERC XBRL Taxonomies #42

Closed zaneselvans closed 2 years ago

zaneselvans commented 2 years ago

We currently archive the XBRL data published via FERC's RSS feeds, but rely on live versions of the XBRL taxonomies to parse that data. This is brittle: if the live taxonomies go down, the archived XBRL data becomes impossible to parse. It also means that if a live taxonomy changes, older archived XBRL data may no longer be parseable with the new version.

Update the scrapers to archive the taxonomies alongside the XBRL data they pertain to, so that each archive can be parsed with the version of the taxonomy it was associated with at the time of archiving.
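
As a rough illustration of the idea (not the scrapers' actual code), the sketch below reads the `link:schemaRef` entry point out of each XBRL instance and writes the referenced taxonomy file into the same zip archive as the filings. The function names, the `filings/` and `taxonomies/` archive layout, and the injected `fetch` callable are all assumptions made for this example. It only captures the entry-point schema; a real taxonomy pulls in further schemas and linkbases that would also need to be followed and archived.

```python
import io
import xml.etree.ElementTree as ET
import zipfile
from typing import Callable

# Standard XBRL 2.1 / XLink namespaces used by <link:schemaRef xlink:href="..."/>.
LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"


def taxonomy_urls(instance_xml: bytes) -> list[str]:
    """Return the sorted, de-duplicated taxonomy entry points an instance refers to."""
    root = ET.fromstring(instance_xml)
    hrefs = (
        ref.get(f"{{{XLINK_NS}}}href")
        for ref in root.iter(f"{{{LINK_NS}}}schemaRef")
    )
    return sorted({href for href in hrefs if href})


def archive_with_taxonomy(
    filings: dict[str, bytes],
    fetch: Callable[[str], bytes],  # hypothetical downloader, e.g. an HTTP GET wrapper
) -> bytes:
    """Zip the filings together with the taxonomy entry points they reference."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        urls: set[str] = set()
        for name, xml in filings.items():
            zf.writestr(f"filings/{name}", xml)
            urls.update(taxonomy_urls(xml))
        # Store each taxonomy file under its URL path (scheme stripped), so the
        # archive records exactly which version the data was published against.
        for url in sorted(urls):
            zf.writestr("taxonomies/" + url.split("://", 1)[-1], fetch(url))
    return buf.getvalue()
```

Passing the downloader in as `fetch` keeps the sketch independent of any particular HTTP library and makes it easy to exercise offline with a stub.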