Closed aesharpe closed 1 year ago
to_numeric data cols. Here is my little check-er:
Setup
import pandas as pd
import pudl
from pudl.transform.ferc1 import PurchasedPowerTableTransformer

# NOTE: pudl_settings (the PUDL workspace settings) is assumed to be defined already.
ferc1_settings = pudl.settings.Ferc1Settings(tables=["purchased_power_ferc1"])
# Extract FERC Form 1 DBF data
ferc1_dbf_raw_dfs = pudl.extract.ferc1.extract_dbf(
    ferc1_settings=ferc1_settings, pudl_settings=pudl_settings
)
# Extract FERC Form 1 XBRL data
ferc1_xbrl_raw_dfs = pudl.extract.ferc1.extract_xbrl(
    ferc1_settings=ferc1_settings, pudl_settings=pudl_settings
)
# Keep intermediate dataframes cached so they can be inspected after each step
pp = PurchasedPowerTableTransformer(cache_dfs=True, clear_cached_dfs=False)
start = pp.transform_start(
    raw_dbf=ferc1_dbf_raw_dfs["purchased_power_ferc1"],
    raw_xbrl_instant=ferc1_xbrl_raw_dfs["purchased_power_ferc1"]["instant"],
    raw_xbrl_duration=ferc1_xbrl_raw_dfs["purchased_power_ferc1"]["duration"],
)
Bad data finder:
num_cols = [
    "billing_demand_mw",
    "non_coincident_peak_demand_mw",
    "coincident_peak_demand_mw",
]
col = num_cols[0]
# Coerce to numeric; anything that cannot be parsed becomes NaN
out_col = pd.to_numeric(start[col], errors="coerce")
# Unique original values in the rows that came out null
start.loc[out_col.isnull(), col].unique()
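The same check can be run over every column in one pass. Below is a minimal sketch on a toy DataFrame standing in for `start`; the helper name `find_non_numeric` and the sample values are made up for illustration and are not part of PUDL. Unlike the one-liner above, it skips values that were already null before coercion, so only the genuinely unparseable strings show up.

```python
import pandas as pd


def find_non_numeric(df: pd.DataFrame, cols: list[str]) -> dict[str, list]:
    """Map each column to the unique non-null values pd.to_numeric cannot parse."""
    bad = {}
    for col in cols:
        coerced = pd.to_numeric(df[col], errors="coerce")
        # Null after coercion but not before: the original value was not numeric.
        mask = coerced.isna() & df[col].notna()
        bad[col] = df.loc[mask, col].unique().tolist()
    return bad


# Toy stand-in for `start` (hypothetical values).
toy = pd.DataFrame(
    {
        "billing_demand_mw": ["1.5", "N/A", "2", None],
        "coincident_peak_demand_mw": ["3", "4.0", "-", "-"],
    }
)
bad_values = find_non_numeric(toy, list(toy.columns))
print(bad_values)
```

With the real data this would be called as `find_non_numeric(start, num_cols)`.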
Closed by PR #2011
conversions from old transform to new
Ferc1AbstractTableTransformer.assign_record_id deployed in Ferc1AbstractTableTransformer.process_dbf and Ferc1AbstractTableTransformer.process_xbrl
AbstractTableTransformer.enforce_schema deployed in Ferc1AbstractTableTransformer.transform_end
rename_cols
assign_record_id
.strip_non_numeric_values maybe?
transform_main added record_id col... need to go investigate that. maybe make a record_id nullable: bool param?
drop_duplicates into transform_main
missing plant/util ids
pytest test/integration/glue_test.py --live-dbs --save-unmapped-ids
(not needed!/tests all pass)
settings files
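For the `.strip_non_numeric_values` item, here is a rough sketch of what such a cleaning step might look like on a string column. This is an illustration under assumptions, not PUDL's implementation: the function name `strip_non_numeric` and the regular expression are hypothetical, and the pattern only handles plain signed decimals (no thousands separators or exponents).

```python
import pandas as pd


def strip_non_numeric(col: pd.Series) -> pd.Series:
    """Keep the first number-like token in each string, then convert to numeric."""
    # Assumed pattern: optional minus sign, digits, optional decimal part.
    extracted = col.str.extract(r"(-?\d+(?:\.\d+)?)", expand=False)
    # Rows with no match are null after extraction and stay null here.
    return pd.to_numeric(extracted, errors="coerce")


# Hypothetical messy values of the kind the bad data finder surfaces.
raw = pd.Series(["12.5 MW", "N/A", " 7", None, "-3"])
cleaned = strip_non_numeric(raw)
print(cleaned)
```

Values with no recognizable number end up null rather than raising, which is the same outcome as `errors="coerce"` but keeps entries like "12.5 MW" that a bare `pd.to_numeric` would discard.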