US-EPA-CAMD / easey-ui

Project Management repo for EPA Clean Air Markets Division (CAMD) Business Suite of applications
MIT License

Emission Views SP Partition Speed Solution #6410

Open djw4erg opened 5 days ago

djw4erg commented 5 days ago

Background

ECMPS uses Emission View tables to store display-ready versions of emissions data. The tables are needed because formatting the reported data on demand takes too long for a user waiting to see the data. Because the emissions data exists in both the Workspace and Official schemas, two sets of Emission View tables are needed. A set of stored procedures (SP) was created to populate the Workspace schema Emission View tables, and those procedures were performant. A version of those SP was created to populate the Official schema Emission View tables, but those procedures were not performant.

I believe the Official schema performance problem was initially attributed primarily to the much larger amount of data in the Official schema. Additionally, based on testing, joining a single emissions table multiple times for different parameters appeared to exacerbate the slowness. Because of the perceived join issue, the "pivot" method was developed. It used table functions to flatten the hourly rows for multiple parameters into a single row per hour, so that only one join per table was required. Although the "pivot" method reduced the slowness, speed problems with some of the Official stored procedures still exist.

Additional research exposed how queries behave against the Official schema emission tables, which are partitioned by RPT_PERIOD_ID. If a where condition, and especially a join condition, on one of these tables does not include the RPT_PERIOD_ID, every partition of that table is searched. This was the original slowness problem, and it explains why the "pivot" method helped: the where clause it used included the RPT_PERIOD_ID.
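To illustrate the partition behavior described above, here is a small toy model in Python. It is not how the database implements partitioning; it only mimics the idea that a lookup without the partition key has to visit every partition, while a lookup with the key visits one. The row layout, the `hour_id` values, and the function names are assumptions made for this sketch.

```python
from collections import defaultdict


def build_partitions(rows):
    """Group rows by rpt_period_id, mimicking one partition per reporting period."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["rpt_period_id"]].append(row)
    return dict(partitions)


def find_hour(partitions, hour_id, rpt_period_id=None):
    """Return (matching rows, number of partitions searched)."""
    if rpt_period_id is not None:
        # Partition key supplied: only that partition is a candidate.
        candidates = {rpt_period_id: partitions.get(rpt_period_id, [])}
    else:
        # No partition key: every partition has to be searched.
        candidates = partitions
    matches = [
        row
        for part_rows in candidates.values()
        for row in part_rows
        if row["hour_id"] == hour_id
    ]
    return matches, len(candidates)


# Hypothetical data: 24 reporting periods (ids 100..123), 3 hours each.
rows = [
    {"rpt_period_id": period, "hour_id": f"H{period}-{hour}"}
    for period in range(100, 124)
    for hour in range(3)
]
partitions = build_partitions(rows)

matches_all, searched_all = find_hour(partitions, "H123-1")
matches_one, searched_one = find_hour(partitions, "H123-1", rpt_period_id=123)
print(searched_all, searched_one)  # prints: 24 1
```

Both calls find the same row; the only difference is how many partitions were visited to find it.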

Previous Tickets

This ticket is intended to replace previous tickets. It will result in a QA of all Emission Views and will cover the testing for those tickets. As a result, both of the tickets below will be closed.

5793

6324

Requirement

Update the stored procedures that load the Emission View tables to use a direct join to the emissions table for each parameter. However, each join to these tables must include the RPT_PERIOD_ID.
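A minimal sketch of the join shape this requirement describes, run against an in-memory SQLite database from Python so it is self-contained. SQLite does not partition tables, so this only shows the query structure, not the speed benefit. The table names, column names, and parameter codes (`hrly_op_data`, `derived_hrly_value`, `parameter_cd`, `adjusted_hrly_value`, `'CO2'`, `'HI'`) are illustrative assumptions and are not taken from the actual stored procedures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-ins for the emissions tables.
conn.executescript("""
CREATE TABLE hrly_op_data (
    hour_id TEXT, rpt_period_id INTEGER, op_hour INTEGER
);
CREATE TABLE derived_hrly_value (
    hour_id TEXT, rpt_period_id INTEGER,
    parameter_cd TEXT, adjusted_hrly_value REAL
);
INSERT INTO hrly_op_data VALUES ('H1', 123, 0), ('H2', 123, 1);
INSERT INTO derived_hrly_value VALUES
    ('H1', 123, 'CO2', 10.5), ('H1', 123, 'HI', 200.0),
    ('H2', 123, 'CO2', 11.0), ('H2', 123, 'HI', 210.0);
""")

# One direct join per parameter; every join repeats rpt_period_id
# in its ON clause, and the WHERE clause filters on it as well.
QUERY = """
SELECT hod.hour_id,
       co2.adjusted_hrly_value AS co2_value,
       hi.adjusted_hrly_value  AS hi_value
FROM hrly_op_data hod
LEFT JOIN derived_hrly_value co2
       ON co2.hour_id = hod.hour_id
      AND co2.rpt_period_id = hod.rpt_period_id
      AND co2.parameter_cd = 'CO2'
LEFT JOIN derived_hrly_value hi
       ON hi.hour_id = hod.hour_id
      AND hi.rpt_period_id = hod.rpt_period_id
      AND hi.parameter_cd = 'HI'
WHERE hod.rpt_period_id = ?
ORDER BY hod.hour_id
"""

result = conn.execute(QUERY, (123,)).fetchall()
print(result)  # prints: [('H1', 10.5, 200.0), ('H2', 11.0, 210.0)]
```

The point of the sketch is that the same table is joined once per parameter, and no join references the table without also constraining its RPT_PERIOD_ID.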

Additionally, after ensuring they are not used outside of Emission View population, drop the following table functions in both the CAMDECMPS and CAMDECMPSWKS schemas.

Stored Procedure List

SP marked as handled already include the adjustment to use the RPT_PERIOD_ID in every where or on clause that references a partitioned emissions table.

Stored Procedure Screen Name Official Workspace
REFRESH_EMISSION_VIEW_ALL Hourly Combined Parameters View :heavy_check_mark:
REFRESH_EMISSION_VIEW_CO2APPD Hourly CO2 Appendix D View
REFRESH_EMISSION_VIEW_CO2CALC CO2 Calculation
REFRESH_EMISSION_VIEW_CO2CEMS Hourly CO2 CEMS View
REFRESH_EMISSION_VIEW_CO2DAILYFUEL CO2 Daily Fuel Sampling View
REFRESH_EMISSION_VIEW_DAILYCAL Daily Calibration View :x:
REFRESH_EMISSION_VIEW_HIAPPD Hourly Heat Input Appendix D View
REFRESH_EMISSION_VIEW_HICEMS Hourly Heat Input CEMS View
REFRESH_EMISSION_VIEW_HIUNITSTACK Heat Input for Unit/Stack View
REFRESH_EMISSION_VIEW_LME LME View
REFRESH_EMISSION_VIEW_MASSOILCALC Mass Oil Calculation View
REFRESH_EMISSION_VIEW_MATSHCL MATS HCL View
REFRESH_EMISSION_VIEW_MATSHF MATS HF View
REFRESH_EMISSION_VIEW_MATSHG MATS HG View :x:
REFRESH_EMISSION_VIEW_MATSSO2 MATS SO2 View
REFRESH_EMISSION_VIEW_MATSSORBENT MATS Sorbent View :x:
REFRESH_EMISSION_VIEW_MATSWEEKLY MATS weekly View
REFRESH_EMISSION_VIEW_MOISTURE Moisture View
REFRESH_EMISSION_VIEW_NOXAPPEMIXEDFUEL Unit Level Fuel Curve View :x:
REFRESH_EMISSION_VIEW_NOXAPPESINGLEFUEL NOX Appendix E Individual Fuel Curve View
REFRESH_EMISSION_VIEW_NOXMASSCEMS Hourly NOx Mass CEMS View :x:
REFRESH_EMISSION_VIEW_NOXRATECEMS Hourly NOx Rate CEMS View
REFRESH_EMISSION_VIEW_OTHERDAILY Other Daily Tests View
REFRESH_EMISSION_VIEW_SO2APPD Hourly SO2 Appendix D View
REFRESH_EMISSION_VIEW_SO2CEMS Hourly SO2 CEMS View
djw4erg commented 1 day ago

Explain Plan Cost Results

The explain plan for the actual pivot version of the select used to populate the Emission View tables uses a default estimate for the pivot function call, and that default does not reflect the work the pivot functions actually do. So a separate version of the select statement had to be created to get a more accurate explain plan cost for the pivot version.

| Model's Stored Procedure | Pivot Low | Pivot High | New Low | New High |
| --- | --- | --- | --- | --- |
| REFRESH_EMISSION_VIEW_ALL | 10,822.33 | 138,045.30 | 8,327.44 | 32,209.00 |
| REFRESH_EMISSION_VIEW_CO2APPD | | | | |
| REFRESH_EMISSION_VIEW_CO2CALC | | | | |
| REFRESH_EMISSION_VIEW_CO2CEMS | 47,856.64 | 85,952.26 | 9,493.77 | 33,422.93 |
| REFRESH_EMISSION_VIEW_CO2DAILYFUEL | | | | |
| REFRESH_EMISSION_VIEW_DAILYCAL | | | | |
| REFRESH_EMISSION_VIEW_HIAPPD | | | | |
| REFRESH_EMISSION_VIEW_HICEMS | 9,497.71 | 33,444.67 | 5,526.20 | 5,544.28 |
| REFRESH_EMISSION_VIEW_HIUNITSTACK | | | | |
| REFRESH_EMISSION_VIEW_LME | | | | |
| REFRESH_EMISSION_VIEW_MASSOILCALC | | | | |
| REFRESH_EMISSION_VIEW_MATSHCL | | | | |
| REFRESH_EMISSION_VIEW_MATSHF | | | | |
| REFRESH_EMISSION_VIEW_MATSHG | | | | |
| REFRESH_EMISSION_VIEW_MATSSO2 | | | | |
| REFRESH_EMISSION_VIEW_MATSSORBENT | | | | |
| REFRESH_EMISSION_VIEW_MATSWEEKLY | | | | |
| REFRESH_EMISSION_VIEW_MOISTURE | | | | |
| REFRESH_EMISSION_VIEW_NOXAPPEMIXEDFUEL | | | | |
| REFRESH_EMISSION_VIEW_NOXAPPESINGLEFUEL | | | | |
| REFRESH_EMISSION_VIEW_NOXMASSCEMS | | | | |
| REFRESH_EMISSION_VIEW_NOXRATECEMS | | | | |
| REFRESH_EMISSION_VIEW_OTHERDAILY | | | | |
| REFRESH_EMISSION_VIEW_SO2APPD | | | | |
| REFRESH_EMISSION_VIEW_SO2CEMS | | | | |

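
For a quick comparison of the three procedures that have results so far, the pivot cost can be divided by the new cost in each column. The sketch below does that arithmetic with the figures from the table above. Explain plan costs are planner estimates in arbitrary units, so these ratios are only a rough indication and not measured run times. The dictionary layout and function name are assumptions made for this example.

```python
# (pivot_low, pivot_high, new_low, new_high) copied from the table above.
costs = {
    "REFRESH_EMISSION_VIEW_ALL": (10822.33, 138045.30, 8327.44, 32209.00),
    "REFRESH_EMISSION_VIEW_CO2CEMS": (47856.64, 85952.26, 9493.77, 33422.93),
    "REFRESH_EMISSION_VIEW_HICEMS": (9497.71, 33444.67, 5526.20, 5544.28),
}


def cost_ratios(pivot_low, pivot_high, new_low, new_high):
    """Return (low ratio, high ratio): pivot cost divided by new cost."""
    return pivot_low / new_low, pivot_high / new_high


ratios = {name: cost_ratios(*values) for name, values in costs.items()}

for name, (low, high) in ratios.items():
    print(f"{name}: low {low:.2f}x, high {high:.2f}x")
```

A ratio above 1 means the new direct-join version has the lower estimated cost for that column.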
djw4erg commented 1 day ago

Test Emission Reports

| Stored Procedure | ORIS | Facility Name | Locations | Quarter |
| --- | --- | --- | --- | --- |
| REFRESH_EMISSION_VIEW_ALL | 2712 | Roxboro | CS004A, 4A, 4B | 2023 Q3 |

Monitoring Plan Ids

| ORIS | Facility Name | Locations | MP Id |
| --- | --- | --- | --- |
| 2712 | Roxboro | CS004A, 4A, 4B | MDC-4F0130B46ADB4B4F9C2EE0BA4A72D463 |

Reporting Period Ids

| Quarter | Id |
| --- | --- |
| 2023 Q3 | 123 |