catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Generate documentation for extracted XBRL tables #1629

Closed: zschira closed 2 years ago

zschira commented 2 years ago

FERC's XBRL taxonomy contains quite a bit of metadata and descriptions, which we can use to generate a tabular data resource that documents the tables produced by XBRL extraction.
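
As a rough illustration of the idea, the sketch below turns taxonomy-style metadata for one table into a tabular data resource descriptor. Everything specific here is an assumption made for the example: the `TAXONOMY_TABLE` structure, the table and concept names, the descriptions, the `XBRL_TO_FIELD_TYPE` mapping, and the `build_resource` helper are all hypothetical and do not reflect the actual extractor's code or the real FERC taxonomy contents.

```python
import json

# Hypothetical excerpt of taxonomy metadata for one extracted table.
# The table name, concept names, and descriptions are illustrative only.
TAXONOMY_TABLE = {
    "name": "electric_plant_in_service",
    "description": "Illustrative table description taken from the taxonomy.",
    "concepts": [
        {"name": "entity_id", "xbrl_type": "string",
         "documentation": "Identifier of the reporting entity."},
        {"name": "report_year", "xbrl_type": "integer",
         "documentation": "Year covered by the filing."},
        {"name": "ending_balance", "xbrl_type": "monetary",
         "documentation": "Balance at the end of the reporting period."},
    ],
}

# Assumed mapping from XBRL-style data types to Table Schema field types.
XBRL_TO_FIELD_TYPE = {
    "string": "string",
    "integer": "integer",
    "decimal": "number",
    "monetary": "number",
    "date": "date",
    "boolean": "boolean",
}


def build_resource(table: dict) -> dict:
    """Build a tabular data resource descriptor from taxonomy-style metadata."""
    fields = [
        {
            "name": concept["name"],
            # Fall back to "string" for any type missing from the mapping.
            "type": XBRL_TO_FIELD_TYPE.get(concept["xbrl_type"], "string"),
            "description": concept["documentation"],
        }
        for concept in table["concepts"]
    ]
    return {
        "profile": "tabular-data-resource",
        "name": table["name"],
        "path": f"{table['name']}.csv",
        "description": table["description"],
        "schema": {"fields": fields},
    }


resource = build_resource(TAXONOMY_TABLE)
print(json.dumps(resource, indent=2))
```

The descriptor is a plain dictionary, so it could either be written out as JSON or passed to other metadata tooling.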

zaneselvans commented 2 years ago

If it's possible to automatically generate rich metadata for the new XBRL format, that would be great! It's something we were never really able to do with the only semi-structured FoxPro database. Being able to export that information directly into our documentation would be very helpful. Our Pydantic metadata structures were originally modeled on the tabular data resources / packages. Are you thinking of generating outputs that could be consumed by our existing metadata infrastructure? Or actual datapackage.json descriptors?

zschira commented 2 years ago

I was planning on using the existing metadata infrastructure as much as possible. So far I've just been working on cleaning up table/column names and organizing metadata from the taxonomy to get prepped for actually generating the structures. Do you have thoughts on using the existing metadata models?
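
For context, the name cleanup could look something like the sketch below, which normalizes a CamelCase or free-text name into snake_case. This is only an assumption about what the cleanup involves: the `to_snake_case` helper and the example names are hypothetical, and the real extractor may apply different rules.

```python
import re


def to_snake_case(name: str) -> str:
    """Normalize a CamelCase or free-text name into snake_case."""
    # Replace any run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name)
    # Split an acronym from a following capitalized word: "FERCAccount".
    cleaned = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", cleaned)
    # Split a lowercase letter or digit from a following capital: "tI".
    cleaned = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", cleaned)
    return cleaned.strip("_").lower()


# Hypothetical names, used only to exercise the helper.
for raw in ["ElectricPlantInService", "FERCAccountNumber", "Plant Name - Total"]:
    print(raw, "->", to_snake_case(raw))
```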

zschira commented 2 years ago

Datapackage generation has been implemented in the XBRL repo.