-
I think there might be an issue with the current harvesting process as applied to the 861 tables -- e.g. in the `sales_eia861` table, there's a `utility_id_eia`, and then there are some other columns …
-
A review of the resource metadata in `src/pudl/package_data/meta/datapkg/datapackage.json` reveals the following resources are missing primary keys, an attribute required by the new harvest process (#…
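
A quick way to reproduce this kind of review is to scan the descriptor for resources whose schema declares no `primaryKey`. The sketch below is a minimal illustration, assuming the file follows the usual tabular data package layout (`resources[].schema.primaryKey`). The helper names and the toy descriptor are made up for this example and are not part of PUDL.

```python
import json
from pathlib import Path


def load_descriptor(path):
    """Read a datapackage.json descriptor from disk into a dict."""
    return json.loads(Path(path).read_text())


def resources_missing_primary_key(descriptor):
    """Return the names of resources whose schema has no primaryKey."""
    missing = []
    for resource in descriptor.get("resources", []):
        schema = resource.get("schema")
        # A schema may be absent or given as a path string; count both as missing.
        if not isinstance(schema, dict) or not schema.get("primaryKey"):
            missing.append(resource.get("name", "<unnamed>"))
    return missing


# Hypothetical, hand-written descriptor used only for illustration.
example = {
    "resources": [
        {"name": "table_a", "schema": {"fields": [], "primaryKey": ["id"]}},
        {"name": "table_b", "schema": {"fields": []}},
    ]
}
missing = resources_missing_primary_key(example)
print(missing)
```

To run it against the real metadata, pass the result of `load_descriptor(...)` with the path to the descriptor instead of `example`.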
-
Several test / build failures appear when using pandas v1.3.0, e.g. #1056 (having to do with the way errors are reported in data package validation) and #1057 (some kind of pyarrow import error while …
-
I found a bunch of foreign key violations while compiling and standardizing the labeling tables (see #1252), which isn't surprising, since we're linking up these coded columns with their home tables …
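
For reference, here is one way violations like these can be surfaced once the tables live in SQLite. This is a sketch under assumptions, not how the violations above were found: the `fuel_codes` and `fuel_records` tables and their contents are invented for the example. SQLite does not enforce foreign keys unless asked to, so the bad row loads without complaint and `PRAGMA foreign_key_check` reports it afterwards.

```python
import sqlite3

# In-memory database with a made-up coding table and a table that references it.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fuel_codes (code TEXT PRIMARY KEY);
    CREATE TABLE fuel_records (
        id INTEGER PRIMARY KEY,
        fuel_code TEXT REFERENCES fuel_codes (code)
    );
    INSERT INTO fuel_codes VALUES ('BIT'), ('NG');
    -- 'XX' has no matching row in fuel_codes; NULL is not a violation.
    INSERT INTO fuel_records (id, fuel_code) VALUES (1, 'BIT'), (2, 'XX'), (3, NULL);
    """
)

# Each result row is (child table, rowid, parent table, foreign key index).
violations = conn.execute("PRAGMA foreign_key_check").fetchall()
for table, rowid, parent, fk_index in violations:
    print(f"{table} rowid={rowid} has no match in {parent}")
```

Running `PRAGMA foreign_keys = ON` before loading would instead make the offending insert fail, which is stricter but less convenient when the goal is to list all violations at once.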
-
Based on our discussion in #1406, we have decided to prototype a Dagster version of our CEMS pipeline.
There are a number of features I'd like to test out:
- [Asset Lineage](https://docs.dagster.…
-
# Description
Change the PUDL data processing pipeline to write many of its outputs directly to a database, rather than a bundle of tabular data packages made up of CSV and JSON files. For cost-effec…
-
Right now, the ETL is configured to run a bundle that consists of one or more datapackages that contain one or more datasets. This allows cross-dataset dependencies (e.g. epacems needs to load a specific…
-
The `generation_fuel_eia923` table should have a natural primary key of:
* `report_date`
* `plant_id_eia`
* `fuel_type`
* `prime_mover_code`
* `nuclear_unit_id`
However, there are a bunch …
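
A small check for whether the five columns listed above really identify each record uniquely could look like the following. It is a dependency-free sketch: the records are invented, and `duplicated_keys` is a hypothetical helper rather than anything in PUDL.

```python
from collections import Counter

# The natural primary key columns proposed above.
PRIMARY_KEY = (
    "report_date",
    "plant_id_eia",
    "fuel_type",
    "prime_mover_code",
    "nuclear_unit_id",
)


def duplicated_keys(records, key_columns=PRIMARY_KEY):
    """Map each key tuple that occurs more than once to its occurrence count."""
    counts = Counter(
        tuple(record.get(column) for column in key_columns) for record in records
    )
    return {key: n for key, n in counts.items() if n > 1}


# Made-up records: the first two share every key column.
records = [
    {"report_date": "2019-01-01", "plant_id_eia": 1, "fuel_type": "NG",
     "prime_mover_code": "CT", "nuclear_unit_id": None},
    {"report_date": "2019-01-01", "plant_id_eia": 1, "fuel_type": "NG",
     "prime_mover_code": "CT", "nuclear_unit_id": None},
    {"report_date": "2019-01-01", "plant_id_eia": 1, "fuel_type": "NG",
     "prime_mover_code": "ST", "nuclear_unit_id": None},
]
dupes = duplicated_keys(records)
print(dupes)
```

On a pandas dataframe the same question can be asked with `df.duplicated(subset=list(PRIMARY_KEY), keep=False)`, which flags every row that shares its key with another row.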
-
**Is your feature request related to a problem? Please describe.**
`$ pudl_datastore --help` lists a `--partitions` option but not how to use it, as it does not list the valid `KEY=VALUE` argument…
-
> Side note: there are **many** foreign key relationships which are not enumerated in the current version of the metadata, which we should absolutely add. They include but probably aren't limited to t…