-
After simplifying our test suite setup (issue #942), I ran the data validation tests to make sure they still worked with the new setup. There were a few tables with more rows than expected because (I…
-
I was able to easily run the ETL using the `LocalExecutor`. `docker-compose up` creates two dask workers and runs the ETL using the `DaskExecutor`. This ran but got hung up on various parts of the ETL…
-
### Describe the bug
Running the `pudl_etl` script for the fast ETL results in a KeyError.
### Bug Severity
How badly is this bug affecting you?
- **High:** This bug is preventing me from using…
-
According to our contribution guidelines [here](https://breakthrough-energy.github.io/docs/dev/contribution_guide.html#data-intake-procedure), we will need the data intake procedure for all raw input f…
-
A review of the resource metadata in `src/pudl/package_data/meta/datapkg/datapackage.json` reveals the following resources are missing primary keys, an attribute required by the new harvest process (#…
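
A check like this can be scripted. The sketch below is illustrative only: it assumes the metadata follows the Frictionless Data Package layout, where each entry in `resources` carries a `schema` with an optional `primaryKey`. The helper name `resources_missing_primary_key` and the sample resource names are hypothetical, not part of PUDL.

```python
import json


def resources_missing_primary_key(datapackage: dict) -> list:
    """Return the names of resources whose schema declares no primary key.

    Assumes a Frictionless-style descriptor: a top-level "resources" list,
    each item having an optional "schema" dict with an optional "primaryKey".
    """
    missing = []
    for resource in datapackage.get("resources", []):
        schema = resource.get("schema") or {}
        # An absent, null, or empty primaryKey all count as missing.
        if not schema.get("primaryKey"):
            missing.append(resource.get("name", "<unnamed>"))
    return missing


# Hypothetical descriptor, inlined so the example is self-contained.
example = json.loads("""
{
  "resources": [
    {"name": "demo_with_key", "schema": {"primaryKey": ["id"]}},
    {"name": "demo_without_key", "schema": {"fields": []}},
    {"name": "demo_no_schema"}
  ]
}
""")

missing_names = resources_missing_primary_key(example)
print(missing_names)
```

To run it against the real file, replace the inlined descriptor with `json.load()` on the opened `datapackage.json`.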
-
Based on our discussion in #1406, we have decided to prototype a Dagster version of our CEMS pipeline.
There are a number of features I'd like to test out:
- [Asset Lineage](https://docs.dagster.…
-
Integration of the EIA 923 data from 2001-2008 (see PR #1035) has added some new EIA plants and utilities that previously didn't appear, and so they are now showing up as unmapped in the PUDL plant / …
-
Once FERC Form 1 and EIA 923 are connected (#212), we can take the generation unit level marginal cost of fuel based on EIA data, and the non-fuel costs from FERC, and use them to estimate total MCOE …
-
Currently the `fuel_cost`, `hr_by_unit`, and `hr_by_gen` outputs from the MCOE process end up having about the same number of records regardless of whether the frequency of the outputs is annual or mo…
-
# Description
Change the PUDL data processing pipeline to write many of its outputs directly to a database, rather than a bundle of tabular data packages made up of CSV and JSON files. For cost-effec…
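
As a rough illustration of what "writing outputs directly to a database" could look like, here is a minimal sketch using Python's standard-library `sqlite3` module. SQLite is an assumption made for this example (the text above does not name a database engine), and the `write_table` helper, the `plants_demo` table, and its columns are all hypothetical.

```python
import sqlite3


def write_table(conn, table, columns, rows):
    """Create a table if needed and append rows to it.

    Hypothetical helper: identifiers are quoted but not otherwise validated,
    so this is only suitable for trusted, hard-coded table and column names.
    """
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
write_table(
    conn,
    "plants_demo",
    ["plant_id", "plant_name"],
    [(1, "Alpha"), (2, "Beta")],
)

# Read the rows back to confirm the write.
row_count = conn.execute('SELECT COUNT(*) FROM "plants_demo"').fetchone()[0]
print(row_count)
```

Compared with a bundle of CSV and JSON files, a single database file keeps the tables and their types together, though the right engine and schema management approach would depend on details not covered here.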