The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
Background
To ingest XBRL data into PUDL, we need a datastore that can interpret XBRL archives (#1593). Each archive consists of a set of XBRL filings plus metadata pulled from the RSS feed and stored in a JSON file. The metadata lists the filings submitted by an individual filer for a specified year and period, along with additional information such as the date and time each filing was submitted. This metadata is required because filers can resubmit filings at any time, so there may be multiple filings from one filer for a given year and period, and PUDL must know which filing to use.
Design
The datastore will open the metadata file and find the most recent filing for every filer/year/period combination. We will assume that the most recent filing is the best one to process. It will then read these files into in-memory buffers, which will be passed to the XBRL extractor.
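
The selection step described above could look roughly like the following. This is only a minimal sketch under several assumptions that this description does not confirm: that the archive is a zip file, that the metadata lives in a member named `metadata.json`, and that each metadata entry carries `filer_id`, `year`, `period`, `submitted` (an ISO 8601 timestamp), and `filename` fields. The function names `latest_filings` and `read_filings` are hypothetical as well.

```python
import io
import json
import zipfile
from datetime import datetime


def latest_filings(metadata: dict) -> dict:
    """Map each (filer, year, period) key to its most recently submitted filing.

    The field names used here are assumptions for illustration, not the
    actual metadata schema.
    """
    latest = {}
    for filing in metadata["filings"]:
        key = (filing["filer_id"], filing["year"], filing["period"])
        submitted = datetime.fromisoformat(filing["submitted"])
        current = latest.get(key)
        # Keep this filing if it is the first for the key or newer than the one held.
        if current is None or submitted > datetime.fromisoformat(current["submitted"]):
            latest[key] = filing
    return latest


def read_filings(archive: zipfile.ZipFile, metadata_name: str = "metadata.json") -> dict:
    """Read the newest filing for each key into an in-memory buffer."""
    metadata = json.loads(archive.read(metadata_name))
    return {
        key: io.BytesIO(archive.read(filing["filename"]))
        for key, filing in latest_filings(metadata).items()
    }
```

Under these assumptions, calling `read_filings` on an open `zipfile.ZipFile` returns a dictionary of `io.BytesIO` buffers keyed by `(filer_id, year, period)`, which could then be handed to the XBRL extractor. Older resubmissions for the same key are simply never read.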