Open jwhitehead77 opened 8 months ago
CVP will not work on this ticket. DPC will complete reamaining work
Jason, is it possible to load the Emission View data from Workspace into Official when emission reports are submitted instead of reproducing the values? This approach would need to account for the following:
The second issue would eventually require improving the efficiency of the Official process, but the need might not be as urgent.
[x] refresh_emission_view_noxappemixedfuel
[x] refresh_emission_view_noxappesinglefuel
[x] refresh_emission_view_noxmasscems
need to discuss with @ergjustin
Related Tickets:
Address monitor_hrly_value slowness: https://github.com/US-EPA-CAMD/easey-ui/issues/6174
As far as I can tell the pivot function method would not use indexes to join the DHV and MHV data. This might not cause an execution hit, but worries me. The following method achieves a similar result as the pivot function method, but should join using indexes.
View Emissions SO2 CEMS Query
select cor.Oris_Code,
cor.Facility_Name,
cor.Locations,
/* Real Columns */
cor.Mon_Plan_Id,
cor.Mon_Loc_Id,
cor.Rpt_Period_Id,
cor.Begin_Date as Date,
cor.Begin_Hour as Hour,
cor.Op_Time,
cor.Hr_Load as Unit_Load,
cor.Load_Uom_Cd as Load_Uom,
cor.So2c_Mhv_Unadjusted_Hrly_Value as Unadj_So2,
cor.So2c_Mhv_Applicable_Bias_Adj_Factor as So2_Baf,
cor.So2c_Mhv_Adjusted_Hrly_Value as Rpt_Adj_So2,
cor.So2c_Mhv_Calc_Adjusted_Hrly_Value as Calc_Adj_So2,
cor.So2c_Mhv_Modc_Cd as So2_Modc,
cor.So2c_Mhv_Pct_Available as So2_Pma,
cor.Flow_Mhv_Unadjusted_Hrly_Value as Unadj_Flow,
cor.Flow_Mhv_Applicable_Bias_Adj_Factor as Flow_Baf,
cor.Flow_Mhv_Adjusted_Hrly_Value as Rpt_Adj_Flow,
cor.Flow_Mhv_Calc_Adjusted_Hrly_Value as Adj_Flow_Used,
cor.Flow_Mhv_Modc_Cd as Flow_Modc,
cor.Flow_Mhv_Pct_Available as Flow_Pma,
cor.So2_Dhv_Calc_Pct_Moisture as Pct_Wrong_Used,
case
when cor.So2_Dhv_Calc_Pct_Moisture is null then null
when cor.h2o_dhv_Modc_Cd is not null then cor.h2o_dhv_Modc_Cd
when cor.h2o_mhv_Modc_Cd is not null then cor.h2o_mhv_Modc_Cd
else 'DF'
end as Source_H2o_Value,
frm.Equation_Cd as So2_Formula_Code,
cor.So2_Dhv_Adjusted_Hrly_Value as Rpt_So2_Mass_Rate,
cor.So2_Dhv_Calc_Adjusted_Hrly_Value as Calc_So2_Mass_Rate,
cor.Hi_Dhv_Calc_Adjusted_Hrly_Value,
cor.H2o_Dhv_Modc_Cd,
cor.So2r_Dhv_Adjusted_Hrly_Value,
cor.H2o_Mhv_Modc_Cd,
case (frm.Equation_Cd)
when 'F-23' then cor.Hi_Dhv_Calc_Adjusted_Hrly_Value
end as Calc_Hi_Rate,
case (frm.Equation_Cd)
when 'F-23' then
coalesce
(
cor.So2r_Dhv_Adjusted_Hrly_Value,
(
select max( Default_Value )
from camdecmps.MONITOR_DEFAULT def
where def.Mon_Loc_Id = cor.Mon_Loc_Id
and def.Parameter_Cd = 'SO2R'
and def.Default_Purpose_Cd = 'F23'
and ( def.Begin_Date + interval '1' HOUR * def.Begin_Hour ) <= ( cor.Begin_Date + interval '1' HOUR * cor.Begin_Hour )
and ( ( def.End_Date is null) or ( def.End_Date + interval '1' HOUR * def.End_Hour ) <= ( cor.Begin_Date + interval '1' HOUR * cor.Begin_Hour ) )
)
)
end as Default_So2_Emission_Rate
from (
select sel.Oris_Code,
sel.Facility_Name,
sel.Locations,
sel.Mon_Plan_Id,
sel.Rpt_Period_Id,
sel.Mon_Loc_Id,
sel.Begin_Date,
sel.Begin_Hour,
sel.Op_Time,
sel.Hr_Load,
sel.Load_Uom_Cd,
max( case when dhv.Parameter_Cd = 'SO2' then dhv.Hour_Id end ) as So2_Dhv_Hour_Id,
max( case when dhv.Parameter_Cd = 'SO2' then dhv.Calc_Pct_Moisture end ) as So2_Dhv_Calc_Pct_Moisture,
max( case when dhv.Parameter_Cd = 'SO2' then dhv.Adjusted_Hrly_Value end ) as So2_Dhv_Adjusted_Hrly_Value,
max( case when dhv.Parameter_Cd = 'SO2' then dhv.Calc_Adjusted_Hrly_Value end ) as So2_Dhv_Calc_Adjusted_Hrly_Value,
max( case when dhv.Parameter_Cd = 'SO2' then dhv.Mon_Form_Id end ) as So2_Dhv_Mon_Form_Id,
max( case when dhv.Parameter_Cd = 'HI' then dhv.Calc_Adjusted_Hrly_Value end ) as Hi_Dhv_Calc_Adjusted_Hrly_Value,
max( case when dhv.Parameter_Cd = 'H2O' then dhv.Modc_Cd end ) as H2o_Dhv_Modc_Cd,
max( case when dhv.Parameter_Cd = 'H2O' then dhv.Pct_Available end ) as H2o_Dhv_Pct_Available,
max( case when dhv.Parameter_Cd = 'SO2R' then dhv.Adjusted_Hrly_Value end ) as So2r_Dhv_Adjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Hour_Id end ) as Flow_Mhv_Hour_Id,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Unadjusted_Hrly_Value end ) as Flow_Mhv_Unadjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Applicable_Bias_Adj_Factor end ) as Flow_Mhv_Applicable_Bias_Adj_Factor,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Adjusted_Hrly_Value end ) as Flow_Mhv_Adjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Calc_Adjusted_Hrly_Value end ) as Flow_Mhv_Calc_Adjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Modc_Cd end ) as Flow_Mhv_Modc_Cd,
max( case when mhv.Parameter_Cd = 'FLOW' then mhv.Pct_Available end ) as Flow_Mhv_Pct_Available,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Unadjusted_Hrly_Value end ) as So2c_Mhv_Unadjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Applicable_Bias_Adj_Factor end ) as So2c_Mhv_Applicable_Bias_Adj_Factor,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Adjusted_Hrly_Value end ) as So2c_Mhv_Adjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Calc_Adjusted_Hrly_Value end ) as So2c_Mhv_Calc_Adjusted_Hrly_Value,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Modc_Cd end ) as So2c_Mhv_Modc_Cd,
max( case when mhv.Parameter_Cd = 'SO2C' then mhv.Pct_Available end ) as So2c_Mhv_Pct_Available,
max( case when mhv.Parameter_Cd = 'H2O' then mhv.Modc_Cd end ) as H2o_Mhv_Modc_Cd
from (
select fac.Oris_Code,
fac.Facility_Name,
(
select string_agg( coalesce( unt.Unitid, stp.Stack_Name ), ', ' order by case when unt.Unit_Id is null then 0 else 1 end, coalesce( unt.Unitid, stp.Stack_Name ) )
from camdecmps.MONITOR_PLAN_LOCATION mpl
join camdecmps.MONITOR_LOCATION loc on loc.Mon_Loc_Id = mpl.Mon_Loc_Id
left join camd.UNIT unt on unt.Unit_Id = loc.Unit_Id
left join camdecmps.STACK_PIPE stp on stp.Stack_Pipe_Id = loc.Stack_Pipe_Id
where mpl.Mon_Plan_Id = pln.Mon_Plan_Id
) as Locations,
hod.Begin_Date,
hod.Begin_Hour,
hod.Op_Time,
hod.Hr_Load,
hod.Load_Uom_Cd,
ems.Mon_Plan_Id,
ems.Rpt_Period_Id,
mpl.Mon_Loc_Id,
hod.Hour_Id
from camdecmps.EMISSION_EVALUATION ems
join camdecmps.MONITOR_PLAN pln on pln.Mon_Plan_Id = ems.Mon_Plan_Id
join camd.PLANT fac on fac.Fac_Id = pln.Fac_Id
join camdecmps.MONITOR_PLAN_LOCATION mpl on mpl.Mon_Plan_Id = ems.Mon_Plan_Id
join camdecmps.HRLY_OP_DATA hod on hod.Mon_Loc_Id = mpl.Mon_Loc_Id and hod.Rpt_Period_Id = ems.Rpt_Period_Id
where ems.Mon_Plan_Id = 'TWCORNEL5-488E42008B434177BC7D7BFF138D18EF'
and ems.Rpt_Period_Id = 109
and pln.End_Rpt_Period_Id is null
) sel
join camdecmps.DERIVED_HRLY_VALUE dhv on dhv.Hour_Id = sel.Hour_Id and dhv.Parameter_Cd in ( 'SO2', 'HI', 'H2O', 'SO2R' )
join camdecmps.MONITOR_HRLY_VALUE mhv on mhv.Hour_Id = sel.Hour_Id and mhv.Parameter_Cd in ( 'FLOW', 'SO2C', 'H20' )
group
by sel.Oris_Code,
sel.Facility_Name,
sel.Locations,
sel.Mon_Plan_Id,
sel.Rpt_Period_Id,
sel.Mon_Loc_Id,
sel.Begin_Date,
sel.Begin_Hour,
sel.Op_Time,
sel.Hr_Load,
sel.Load_Uom_Cd
) cor
left join camdecmps.MONITOR_FORMULA frm on frm.Mon_Form_Id = cor.So2_Dhv_Mon_Form_Id
where cor.So2_Dhv_Hour_Id is not null
and cor.Flow_Mhv_Hour_Id is not null
All involved emissions tables have been analyzed on dev and the refresh procedures are running much faster. Code has been checked for all updated procedures.
@djw4erg to provide scripts which can be used to first clear data from affected views prior to testing.
select distinct
hod.mon_plan_id,
hod.mon_loc_id,
hod.rpt_period_id,
camdecmps.format_date_hour( hod.begin_date, hod.begin_hour, null ) AS date_hour,
hod.op_time,
hod.hr_load as unit_load,
hod.load_uom_cd as load_uom,
noxc.Unadjusted_Hrly_Value as unadjj_nox,
noxc.Applicable_bias_adj_factor as calc_nox_baf,
noxc.adjusted_hrly_value as rpt_adj_nox,
noxc.calc_adjusted_hrly_value as rpt_adj_nox,
noxc.modc_cd as nox_modc,
noxc.pct_available as nox_p0ma,
flow.unadjusted_hrly_value as unadj_flow,
flow.applicable_bias_adj_factor as calc_flow_baf,
flow.adjusted_hrly_value as rpt_adj_flow,
flow.calc_adjusted_hrly_value as adj_flow_used,
flow.modc_cd as flow_modc,
flow.pct_available as flow_pma,
nox.calc_pct_moisture as pct_h2o_used,
case
when ( noxr.calc_pct_moisture is not null ) then
case
when ( h2o_m.modc_cd is not null ) then h2o_m.modc_cd
when ( h2o_d.modc_cd is not null ) then h2o_d.modc_cd
else 'DF'
end
else null
end as source_h2o_value,
frm.equation_cd as noxmass_formula_code,
nox.adjusted_hrly_value as rpt_nox_mass,
nox.calc_adjusted_hrly_value as calc_nox_Mass
from temp_hourly_test_errors as hod
join camdecmps.DERIVED_HRLY_VALUE nox
on nox.hour_id = hod.hour_id
and nox.parameter_cd = 'NOX'
join camdecmps.MONITOR_HRLY_VALUE flow
on flow.hour_id = hod.hour_id
and flow.parameter_cd = 'FLOW'
join camdecmps.MONITOR_FORMULA frm
on frm.mon_form_id = nox.mon_form_id
and frm.equation_cd like 'F-26%'
left join camdecmps.MONITOR_HRLY_VALUE noxc
on noxc.hour_id = hod.hour_id
and noxc.parameter_cd = 'NOXC'
left join camdecmps.MONITOR_HRLY_VALUE h2o_m
on h2o_m.hour_id = hod.hour_id
and h2o_m.parameter_cd = 'H2O'
left join camdecmps.DERIVED_HRLY_VALUE h2o_d
on h2o_d.hour_id = hod.hour_id
and h2o_d.parameter_cd = 'H2O'
left join camdecmps.DERIVED_HRLY_VALUE noxr
on noxr.hour_id = hod.hour_id
and noxr.parameter_cd = 'NOXR'
The emissions view refresh procedures ported from the MS SQL client and used in the workspace do not perform well when used in the official schema due to the vast amount of data. Need to refactor the refresh procedures to be more performant overall.
There are 3 functions that pivot the derived_hrly_value, monitor_hrly_value, & hrly_param_fuel_flow tables so that the values needed for each parameter are a single row for each hour reported.
NOTE: The above functions are dynamic functions that return a dynamic list of columns based on the parameter codes passed into the function from the caller. If you add a column to be returned then anything that uses it must be updated to expected that column.
Instead of joining multiple times on derived_hrly_value, monitor_hrly_value, and/or hrly_param_fuel_flow which has 100+ million records based on different parameters. Replace all these joins with one join to the appropriate pivot functions above passing in the parameters that data is needed for.
The following have been refactored to use the pivot approach and can be used as examples...
The following had previously been refactored to use an approach that does not always work well for every case and needs to be refactored to use the pivot approach...
The following are the original ported procedures that have not been refactored at all and needs to be refactored to use the pivot approach...
The following cannot use the pivot approach but need to be looked at from a performance perspective...
NOTE: Refactor the camdecmps schema (official) procedures and test in the TEST or CDC database which has all of the production data and then copy them to the camdecmpswks schema (workspace) replacing the referenced schema from camdecmps to camdecmpswks
The following procedures are not persisted views and instead uses an actual view for the data so these refresh procedures are just refreshing the counts and should not need refactoring...