As an IMPROVER scientist, I want to be able to trial IMPROVER developments over a historic period using archived input data, and verify their performance using a standard set of metrics.
Anticipated workflow:
Check out standard IMPROVER suite [of which this ticket will create the first version]
Add new plugins/options you want to test
Run trial over relevant historic period
Examine verification results
Revise components and iterate as required
Where appropriate, merge validated changes back to standard suite
The operational system will ultimately be based on a real-time mode of the same suite(s)
See Jonathan Flowerdew’s internal home page for the roadmap of how this relates to other verification development tasks and to the overall Verification Implementation Plan. This user story covers verification of gridded forecasts only; verification of spot forecasts will be added in a few months’ time.
Acceptance criteria:
Combine baseline historic post-processing suite (#68), Ensemble Copula Coupling (#95) and Ric Crocker’s sample verification suite (mi-as726), with support from the authors as required.
Each deterministic output, set of ensemble members, set of percentiles, or set of probabilities to exceed thresholds, from each stage of the post-processing (including the level 1 data retrieved from MASS), becomes a ‘stream’ for verification.
Provide optional facilities to read and verify the corresponding level 0 data from MASS (optional so that, once this has been done for a given verification configuration, the resulting ARDs can simply be reused). A similar option will be added later for the relevant final gridded outputs of the old post-processing system.
Define a convenient standard way to attach ECC plugins to all probability/percentile outputs to convert them back to ensemble members for verification (in a future real-time version of the suite, these should be arranged so as not to delay the time-critical post-processing).
Agree how stream metadata such as names, types, ensemble sizes and member order should be passed to verification components; a minimal illustrative sketch of one possible representation is given after this list.
Agree whether this can be one suite combining post-processing and verification, or whether it has to be two coupled suites run in parallel.
Consider iterating the verification components only through the full lead-time range of the final IMPROVER forecast.
Collate brief user documentation describing each key feature, in an agreed location (Confluence?).
Debug complete package and run initial trial of at least one month (February 2017 unless otherwise agreed).
Use Roger Harbord’s plotting suite to check that verification results have been successfully produced and do not indicate any obvious problems (though detailed scientific analysis and experimentation will be done separately later).
Record trial throughput, identify the main costly/slow steps, and eliminate unnecessary delays where possible.
Incorporate any necessary features to ensure that use of the HPC, MASS, etc. remains fair.
Standard technical reviews, plus JF review of overall suitability for purpose.
Standard configuration running on HPC?
Bonus: Example spot forecast outputs (to pave the way for spot forecast verification).
(Ultimately, Nigel and others probably need configurable saving of sample output for case study analysis, but should we postpone this for now?)
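The stream and metadata criteria above leave open exactly how stream descriptions should be represented. As a point of reference only, below is a minimal Python sketch of one possible representation, with a hook for the ECC conversion back to ensemble members. Every name in it (StreamDescriptor, convert_to_realizations, prepare_for_verification, the example stream list) is a hypothetical placeholder for whatever is agreed, not existing IMPROVER or verification-suite code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StreamDescriptor:
    """Hypothetical record of the metadata a verification component would
    need for one post-processing 'stream' (name, type, ensemble size, order)."""
    name: str                      # e.g. "recalibrated_percentiles"
    data_type: str                 # "deterministic" | "realizations" | "percentiles" | "probabilities"
    ensemble_size: Optional[int]   # members/percentiles, or target member count after ECC
    member_order: Optional[List[str]] = None  # ordering convention, if one is agreed
    needs_ecc: bool = False        # convert back to ensemble members before verification


def convert_to_realizations(data, n_members):
    """Placeholder for whichever ECC plugin is agreed on (e.g. resampling
    percentiles or probabilities back into ensemble members)."""
    raise NotImplementedError("Attach the agreed ECC plugin here.")


def prepare_for_verification(stream, data):
    """Apply the ECC conversion where a stream requires it, so that every
    stream reaches the verification components as ensemble members."""
    if stream.needs_ecc:
        data = convert_to_realizations(data, n_members=stream.ensemble_size)
    return data


# Illustrative stream list covering the cases named in the acceptance criteria.
streams = [
    StreamDescriptor("level1_input", "realizations", ensemble_size=12),
    StreamDescriptor("recalibrated_percentiles", "percentiles", ensemble_size=19, needs_ecc=True),
    StreamDescriptor("threshold_probabilities", "probabilities", ensemble_size=12, needs_ecc=True),
]
```

Whatever representation is finally agreed, the point of the sketch is that each stream carries enough metadata (type, size, order, whether ECC is needed) for the verification components to handle every stage of the post-processing uniformly.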