Closed JohnHalleyGotway closed 1 month ago
John Wagner, via met-help, indicated that this feature would also be useful for NOAA/MDL in their use of Series-Analysis: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=95583
This is a Met Office deliverable due November 2024.
This issue was discussed during the Met Office NGVER meeting on July 24, 2024.
The functionality needed here is similar to how the Gen-Vx-Mask tool works. When gen_vx_mask is given its own output as input, it initializes values using the previously defined mask.
For Series-Analysis, the logic needed is described below:
-input command line option.
-input file for relevant inputs.
Working on the feature_1371_series_analysis branch. Added the -input command line argument to specify the output from previous Series-Analysis runs. Also added support for "ALL" being specified for the CTC, MCTC, PCT, and SL1L2 line types.
Still need to work on reading data from the -input file to aggregate prior results with the current data.
@KathrynNewman advised that the -input command line argument name is confusing. Will switch to using -aggregate instead to be consistent with the Stat-Analysis -job aggregate terminology.
TODO: MET #1371
unit_series_analysis.xml already included a FILE_LIST test that processes probability of precip with PCT over a time series of length 7. I split this into 2 tests: one run with 4 time steps and one -aggr run with the remaining 3.
put_nc_val writes each individual grid point value to the NetCDF file separately. Is that too slow? Should we update std::map<ConcatString, NcVarData> stat_data; so that NcVarData includes a DataPlane? Then we'd update the contents of the DataPlane during processing and write the final values once at the end. Would that be faster?
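To make the question above concrete, here is a minimal C++ sketch, not MET code. GridBuffer and CountingWriter are hypothetical stand-ins for a DataPlane-like buffer and the NetCDF output layer. The sketch only counts write calls to contrast per-point writes with one bulk write; it says nothing about actual NetCDF timings, which would need to be measured.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the NetCDF output layer. It keeps values in
// memory and counts how many write calls were made.
struct CountingWriter {
    std::vector<double> stored;
    std::size_t n_calls = 0;

    explicit CountingWriter(std::size_t n_points) : stored(n_points, 0.0) {}

    // Write a single grid point (one call per point).
    void put_point(std::size_t idx, double value) {
        stored[idx] = value;
        ++n_calls;
    }

    // Write the whole field in one call.
    void put_field(const std::vector<double> &values) {
        stored = values;
        ++n_calls;
    }
};

// Hypothetical DataPlane-like buffer holding one statistic for the full grid.
class GridBuffer {
public:
    GridBuffer(std::size_t nx, std::size_t ny)
        : nx_(nx), ny_(ny), data_(nx * ny, 0.0) {}

    void set(std::size_t x, std::size_t y, double value) {
        data_[y * nx_ + x] = value;
    }
    std::size_t size() const { return nx_ * ny_; }
    const std::vector<double> &values() const { return data_; }

private:
    std::size_t nx_, ny_;
    std::vector<double> data_;
};

// Strategy A: write each grid point as soon as it is computed.
CountingWriter write_per_point(std::size_t nx, std::size_t ny) {
    CountingWriter writer(nx * ny);
    for (std::size_t y = 0; y < ny; ++y)
        for (std::size_t x = 0; x < nx; ++x)
            writer.put_point(y * nx + x, static_cast<double>(x + y));
    return writer;
}

// Strategy B: fill the buffer during processing, then write once at the end.
CountingWriter write_buffered(std::size_t nx, std::size_t ny) {
    CountingWriter writer(nx * ny);
    GridBuffer buffer(nx, ny);
    for (std::size_t y = 0; y < ny; ++y)
        for (std::size_t x = 0; x < nx; ++x)
            buffer.set(x, y, static_cast<double>(x + y));
    writer.put_field(buffer.values());
    return writer;
}
```

Both strategies store identical values. The buffered one trades one grid of memory per output variable for a single write call per variable.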
Error reported when the -aggr file does not contain the variables needed for the aggregation logic:
DEBUG 2: Reading aggregation "series_cnt_TOTAL(*,*)" field.
DEBUG 2: Reading aggregation "series_ctc_TOTAL_gtOCDP25(*,*)" field.
ERROR :
ERROR : read_aggr_data_plane() -> Required variable "series_ctc_TOTAL_gtOCDP25(*,*)" not found in the aggregate file!
ERROR :
ERROR : Recommend recreating "series_analysis_out/series_analysis_STEP1.nc" to request that "ALL" CTC columns be written.
ERROR :
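The check behind an error like the one above can be sketched as follows. This is an illustrative sketch only, not the read_aggr_data_plane() implementation: find_missing_vars() and describe_missing() are hypothetical helpers, and the set of available names stands in for whatever the aggregate file actually contains.

```cpp
#include <set>
#include <string>
#include <vector>

// Return the required variable names that are absent from the aggregate file.
// "available" stands in for the variable names found in that file.
std::vector<std::string> find_missing_vars(
        const std::set<std::string> &available,
        const std::vector<std::string> &required) {
    std::vector<std::string> missing;
    for (const std::string &name : required) {
        if (available.count(name) == 0) missing.push_back(name);
    }
    return missing;
}

// Build a one-line description of the missing variables.
// Returns an empty string when nothing is missing.
std::string describe_missing(const std::vector<std::string> &missing,
                             const std::string &file_name) {
    if (missing.empty()) return "";
    std::string msg = "Aggregate file \"" + file_name + "\" is missing:";
    for (const std::string &name : missing) msg += " \"" + name + "\"";
    return msg;
}
```

Checking every required variable up front, rather than failing on the first one, would let a single message list everything that must be regenerated.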
ANOM_CORR in the CNT line type is being aggregated correctly using the SL1L2 and SAL1L2 partial sums.
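For reference, here is a minimal sketch of why partial sums are enough to aggregate an anomaly correlation. It is not the MET implementation. The AnomSums struct and its functions are hypothetical; the field names mirror the SAL1L2 columns (TOTAL, FABAR, OABAR, FOABAR, FFABAR, OOABAR), and the sketch assumes those columns are means over the matched pairs, so two runs combine as a TOTAL-weighted average.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical container for anomaly partial sums, stored as means.
struct AnomSums {
    double total = 0.0;   // number of matched pairs
    double fabar = 0.0;   // mean forecast anomaly
    double oabar = 0.0;   // mean observation anomaly
    double foabar = 0.0;  // mean of (forecast anomaly * observation anomaly)
    double ffabar = 0.0;  // mean of forecast anomaly squared
    double ooabar = 0.0;  // mean of observation anomaly squared
};

// Compute partial sums from raw anomaly pairs (used here to build inputs).
AnomSums partial_sums(const std::vector<double> &fa,
                      const std::vector<double> &oa) {
    if (fa.size() != oa.size()) throw std::invalid_argument("size mismatch");
    AnomSums s;
    const std::size_t n = fa.size();
    if (n == 0) return s;
    for (std::size_t i = 0; i < n; ++i) {
        s.fabar += fa[i];
        s.oabar += oa[i];
        s.foabar += fa[i] * oa[i];
        s.ffabar += fa[i] * fa[i];
        s.ooabar += oa[i] * oa[i];
    }
    const double dn = static_cast<double>(n);
    s.total = dn;
    s.fabar /= dn;
    s.oabar /= dn;
    s.foabar /= dn;
    s.ffabar /= dn;
    s.ooabar /= dn;
    return s;
}

// Combine two sets of partial sums as a TOTAL-weighted average.
AnomSums aggregate(const AnomSums &a, const AnomSums &b) {
    AnomSums out;
    out.total = a.total + b.total;
    if (out.total <= 0.0) return out;
    const double wa = a.total / out.total;
    const double wb = b.total / out.total;
    out.fabar = wa * a.fabar + wb * b.fabar;
    out.oabar = wa * a.oabar + wb * b.oabar;
    out.foabar = wa * a.foabar + wb * b.foabar;
    out.ffabar = wa * a.ffabar + wb * b.ffabar;
    out.ooabar = wa * a.ooabar + wb * b.ooabar;
    return out;
}

// Anomaly correlation derived from the (possibly aggregated) partial sums.
// Returns NaN when either anomaly series has no variance.
double anom_corr(const AnomSums &s) {
    const double cov = s.foabar - s.fabar * s.oabar;
    const double var_f = s.ffabar - s.fabar * s.fabar;
    const double var_o = s.ooabar - s.oabar * s.oabar;
    const double denom = var_f * var_o;
    if (denom <= 0.0) return std::numeric_limits<double>::quiet_NaN();
    return cov / std::sqrt(denom);
}
```

Under that assumption, aggregating the sums from two runs gives the same sums as processing all pairs in one run, so the derived correlation matches. Averaging the two per-run correlations would generally not.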
Updated compute_cntinfo() to derive statistics from an SL1L2Info object containing both SL1L2 and SAL1L2 partial sums. In the past, the processing of anomalies was controlled via a boolean flag, but the updated implementation is cleaner and simpler.
BSS is computed using both a forecast PCT table and a climo PCT table. Series-Analysis writes the forecast PCT counts but not the climo PCT counts, so the climo PCT counts cannot be aggregated directly. Perhaps BRIERCL can be aggregated (as a weighted average?) and used during the computation of BSS?
The do_climo_brier() function now aggregates the climo Brier score as a weighted average of the old and new values and then recomputes the BSS. I tested to confirm that this aggregation logic works as expected.
The unit_climatology_1.0DEG.xml file includes 2 calls to Series-Analysis, one with deterministic GFS data and a second with probabilistic SREF data. Those calls should be broken down into 2 pieces to demonstrate aggregating SAL1L2 and PCT statistics with climo data.
Series-Analysis is definitely a tool that would benefit from being parallelized. While I'm focusing on the enhancements described in this issue, we should use a separate issue/feature branch to optimize it.
Describe the New Feature
This is a feature that was requested by the UK Met Office via met-help: https://rt.rap.ucar.edu/rt/Ticket/Display.html?id=95578
They would like to be able to create gridded statistics over a longer time period than they can hold their model output and analyses on disk. To enable this, we'd need to enhance Series-Analysis to read its own output to aggregate stats over a longer time period.
After discussing details with the MetOffice on July 24, 2024, we decided to handle this as described below:
Add a new (-aggr) command line option to provide output from a previous run of Series-Analysis.
When the -aggr option is provided, read the previously generated counts and partial sums. Prior to computing the output statistics, aggregate the previously generated counts and partial sums with the newly generated ones. Compute the output statistics from those aggregated values.
Some details...
Acceptance Testing
List input data types and sources. Describe tests required for new functionality.
Time Estimate
3 days?
Sub-Issues
Consider breaking the new feature down into sub-issues. None needed.
Relevant Deadlines
List relevant project deadlines here or state NONE.
Funding Source
Split between MetOffice (2799991) and NOAA (2792543) and 2783544
Define the Metadata
Assignee
Labels
Projects and Milestone
Define Related Issue(s)
Consider the impact to the other METplus components.
New Feature Checklist
See the METplus Workflow for details.
Branch name: feature_<Issue Number>_<Description>
Pull request: feature <Issue Number> <Description>