Closed zschira closed 3 days ago
See review of https://github.com/catalyst-cooperative/pudl-archiver/pull/362 - I think we should consolidate the "figure out the filename" and "figure out what taxonomy each file points at" logic into the archiver, and just read all that information out of rssfeed here.
Background
The pudl-archiver has been updated to decouple taxonomies from years of XBRL data. It now puts all versions of the taxonomies in a single zipfile, so if filings from the same year use different versions of the taxonomy, each version can still be referenced. This PR updates the extractor to accommodate that change. The extractor now parses every taxonomy in the zipfile and builds a dictionary that maps each taxonomy version to its parsed taxonomy. Then, while parsing an individual filing, it detects which taxonomy version the filing references and uses that version to interpret the facts in the filing.
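To make the flow concrete, here is a minimal sketch of the version lookup described above. It is not the code in this PR. The function names, the archive layout (one member per taxonomy version), and the rule that the version is the file name without its extension are all assumptions made for illustration. The only part taken from the XBRL standard is that an instance document points at its taxonomy through a `link:schemaRef` element with an `xlink:href` attribute.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard XBRL namespaces for the schemaRef element and its href attribute.
LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"


def version_from_name(name: str) -> str:
    """Hypothetical convention: the version is the file name without its extension."""
    return name.rsplit("/", 1)[-1].rsplit(".", 1)[0]


def load_taxonomies(archive: zipfile.ZipFile) -> dict[str, bytes]:
    """Map each taxonomy version to its contents.

    Raw bytes stand in for a parsed taxonomy here; a real extractor would
    parse each member before storing it.
    """
    return {version_from_name(name): archive.read(name) for name in archive.namelist()}


def detect_taxonomy_version(filing_xml: bytes) -> str:
    """Read the taxonomy version referenced by a filing's schemaRef element."""
    root = ET.fromstring(filing_xml)
    schema_ref = root.find(f"{{{LINK_NS}}}schemaRef")
    if schema_ref is None:
        raise ValueError("filing has no link:schemaRef element")
    return version_from_name(schema_ref.attrib[f"{{{XLINK_NS}}}href"])


def taxonomy_for_filing(filing_xml: bytes, taxonomies: dict[str, bytes]) -> bytes:
    """Pick the taxonomy that matches the version the filing references."""
    return taxonomies[detect_taxonomy_version(filing_xml)]
```

Keying the dictionary by version means two filings from the same year can resolve to different taxonomies with a single lookup each, and a filing that references an unknown version fails loudly with a `KeyError` instead of being interpreted against the wrong taxonomy.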