catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Transform `f1_purchased_power` xbrl + dbf #1820

Closed aesharpe closed 1 year ago

aesharpe commented 2 years ago

conversions from old transform to new

added

missing plant/util ids

settings files

cmgosnell commented 1 year ago

current task:

Here is my little checker.

Setup:

import pandas as pd
import pudl
from pudl.transform.ferc1 import PurchasedPowerTableTransformer

# Assumption: pudl_settings was not defined in the original snippet; this
# uses the default workspace settings.
pudl_settings = pudl.workspace.setup.get_defaults()

ferc1_settings = pudl.settings.Ferc1Settings(tables=["purchased_power_ferc1"])
# Extract the FERC Form 1 DBF data
ferc1_dbf_raw_dfs = pudl.extract.ferc1.extract_dbf(
    ferc1_settings=ferc1_settings, pudl_settings=pudl_settings
)
# Extract the FERC Form 1 XBRL data
ferc1_xbrl_raw_dfs = pudl.extract.ferc1.extract_xbrl(
    ferc1_settings=ferc1_settings, pudl_settings=pudl_settings
)
# Cache intermediate dataframes so each transform stage can be inspected
pp = PurchasedPowerTableTransformer(cache_dfs=True, clear_cached_dfs=False)
start = pp.transform_start(
    raw_dbf=ferc1_dbf_raw_dfs["purchased_power_ferc1"],
    raw_xbrl_instant=ferc1_xbrl_raw_dfs["purchased_power_ferc1"]["instant"],
    raw_xbrl_duration=ferc1_xbrl_raw_dfs["purchased_power_ferc1"]["duration"],
)

Bad data finder:

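num_cols = [
    "billing_demand_mw",
    "non_coincident_peak_demand_mw",
    "coincident_peak_demand_mw",
]
col = num_cols[0]
# Coerce to numeric; anything that cannot be parsed becomes NaN
out_col = pd.to_numeric(start[col], errors="coerce")
# Distinct raw values that did not parse (this also includes original nulls)
start.loc[out_col.isna(), col].unique()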
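
The same check can be run over every column at once. Below is a minimal, standalone sketch of that idea using only pandas. The helper name `find_non_numeric` and the toy dataframe are hypothetical stand-ins (the real input would be the `start` dataframe from above), and unlike the one-liner it skips values that were already null so only genuinely unparseable strings are reported.

```python
import pandas as pd


def find_non_numeric(df: pd.DataFrame, cols: list[str]) -> dict[str, list]:
    """Return, per column, the distinct non-null values that fail numeric parsing."""
    bad = {}
    for col in cols:
        # Unparseable values become NaN under errors="coerce".
        parsed = pd.to_numeric(df[col], errors="coerce")
        # Keep only rows that had a value but could not be parsed.
        mask = parsed.isna() & df[col].notna()
        bad[col] = sorted(df.loc[mask, col].astype(str).unique())
    return bad


# Hypothetical toy data standing in for the output of transform_start().
toy = pd.DataFrame(
    {
        "billing_demand_mw": ["1.5", "N/A", "20", None, "n/a"],
        "non_coincident_peak_demand_mw": ["3", "4.0", "see footnote", "7", "8"],
        "coincident_peak_demand_mw": ["0", "1", "2", "3", "4"],
    }
)

bad_values = find_non_numeric(toy, list(toy.columns))
for col, values in bad_values.items():
    print(col, values)
```

With the real data, something like `find_non_numeric(start, num_cols)` would list the strings that need cleaning in each demand column before it can be cast to a numeric dtype.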

zaneselvans commented 1 year ago

Closed by PR #2011