catalyst-cooperative / ferc-xbrl-extractor

A tool for converting FERC filings published in XBRL into SQLite databases
MIT License

Ferc xbrl updates #233

Closed zschira closed 3 days ago

zschira commented 1 week ago

Background

The pudl-archiver has been updated to decouple taxonomies from years of XBRL data. It now puts all versions of the taxonomies in a single zipfile, so filings from the same year that use different taxonomy versions can each reference the correct one. This PR updates the extractor to accommodate that change: it parses all of the taxonomies and builds a dictionary mapping each taxonomy version to its parsed taxonomy. Then, while parsing an individual filing, it detects the taxonomy version referenced by that filing and uses that version to interpret the filing's facts.
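
As a rough illustration of the flow described above (not the extractor's actual code), here is a minimal sketch using only the Python standard library. It assumes, hypothetically, that each member of the taxonomy zipfile is named after its version (for example `2023-01-01.zip`) and that a filing's taxonomy version can be read from a date embedded in the `href` of its `link:schemaRef` element. The function names `load_taxonomies` and `detect_taxonomy_version`, the example URL, and the byte strings standing in for parsed taxonomies are all made up for this example.

```python
import io
import re
import xml.etree.ElementTree as ET
import zipfile
from pathlib import PurePosixPath

# Standard XBRL namespaces for the schemaRef element and its href attribute.
LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"


def load_taxonomies(archive: bytes) -> dict[str, bytes]:
    """Map each taxonomy version to its (stand-in) parsed taxonomy.

    Assumption: the version is the member's file stem, e.g. "2023-01-01".
    Real parsing is replaced here by simply reading the raw bytes.
    """
    taxonomies = {}
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for name in zf.namelist():
            version = PurePosixPath(name).stem
            taxonomies[version] = zf.read(name)
    return taxonomies


def detect_taxonomy_version(filing_xml: str) -> str:
    """Return the taxonomy version referenced by a filing's schemaRef.

    Assumption: the version appears as a YYYY-MM-DD date in the href.
    """
    root = ET.fromstring(filing_xml)
    schema_ref = root.find(f"{{{LINK_NS}}}schemaRef")
    if schema_ref is None:
        raise ValueError("filing has no schemaRef element")
    href = schema_ref.get(f"{{{XLINK_NS}}}href") or ""
    match = re.search(r"\d{4}-\d{2}-\d{2}", href)
    if match is None:
        raise ValueError(f"no taxonomy version found in {href!r}")
    return match.group(0)


# Build a tiny in-memory archive holding two taxonomy versions.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("2022-01-01.zip", b"taxonomy-2022")
    zf.writestr("2023-01-01.zip", b"taxonomy-2023")
taxonomies = load_taxonomies(buffer.getvalue())

# A stripped-down filing that points at the 2023 taxonomy.
filing = (
    '<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance" '
    f'xmlns:link="{LINK_NS}" xmlns:xlink="{XLINK_NS}">'
    '<link:schemaRef xlink:type="simple" '
    'xlink:href="https://example.com/taxonomy/form1/2023-01-01/form-1.xsd"/>'
    "</xbrli:xbrl>"
)

# Look up the taxonomy matching the version the filing references.
version = detect_taxonomy_version(filing)
taxonomy_for_filing = taxonomies[version]
```

Keying the dictionary by version means two filings from the same year can resolve to different taxonomies without any per-year assumptions.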

jdangerx commented 1 week ago

See the review of https://github.com/catalyst-cooperative/pudl-archiver/pull/362. I think we should consolidate the "figure out the filename" and "figure out what taxonomy each file points at" logic into the archiver, and just read all of that information out of the rssfeed here.