During an in-person discussion, it was agreed that reducing the size of the acceptance test data is a useful target, as this will make maintenance of the dataset easier and make the acceptance tests quicker to run.

The intention is to reduce spatial dimensions to somewhere around 50x50 to 150x150, depending on the diagnostic and the plugin - smaller is better if possible, but some will need a larger spatial area to include realistic weather features in the data. Some data files contain probability thresholds, percentiles or ensemble members and these can also be thinned out where practical to reduce the data size.
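As a rough illustration of thinning a non-spatial dimension, the sketch below keeps a subset of percentiles from a stack of fields. It uses plain NumPy arrays in place of real test files, and the percentile values, grid size and the choice of which percentiles to keep are all assumptions made for the example.

```python
import numpy as np

# Hypothetical percentile data: 19 percentiles (5, 10, ..., 95) on a 100x100 grid.
percentiles = np.arange(5, 100, 5)
data = np.zeros((percentiles.size, 100, 100), dtype=np.float32)

# Keep a small subset of percentiles (an arbitrary choice for this sketch).
keep = np.isin(percentiles, [10, 50, 90])
thinned_percentiles = percentiles[keep]
thinned_data = data[keep]

# Data volume scales with the number of percentiles retained.
reduction_factor = data.nbytes / thinned_data.nbytes
print(thinned_percentiles, thinned_data.shape, round(reduction_factor, 2))
```

The same boolean-mask selection would apply to a threshold or realization dimension, provided the tests do not depend on the specific values that are dropped.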

A few tests are currently disabled because they take a very long time to run on the current dataset. These could be re-enabled, as smaller input datasets will almost certainly take less time to run.

These changes are best done over a series of PRs, rather than one large PR. This series-of-PRs approach will allow incremental reductions in the data size, reduce the chance of merge conflicts and also assist with keeping the review size manageable. Most of the review effort will go into looking at the data files; the code changes are likely to be small.

Acceptance criteria:

[ ] Data set size of current acceptance test data reduced to approx. 100 MB
[ ] Acceptance test code and checksums updated as necessary
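For the checksum criterion, a minimal sketch of recomputing SHA-256 checksums for every file under a data directory is shown below. The function names and the `sha256sum`-style line format are assumptions for illustration; this is not the project's own checksum tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(data_dir):
    """Return one '<digest>  ./<relative path>' line per file, sorted by path."""
    root = Path(data_dir)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return [
        f"{sha256_of_file(p)}  ./{p.relative_to(root).as_posix()}" for p in files
    ]
```

Comparing the lines produced before and after a change gives a quick list of which data files were actually replaced.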

The synthetic data approach as described in #1218, #1274, #1275 and #1276 is still valuable, but will require significantly more resources to implement.

Test directory sizes at outset

The list below shows the initial state of the acceptance test data in terms of data volume. The list is provided in descending order of size so that work can start on the largest sets of test files first.

Acceptance test directory sizes


535M    ./wind_downscaling
236M    ./generate-realizations
153M    ./weighted_blending
102M    ./apply-night-mask
98M ./nbhood
97M ./wind_direction
83M ./threshold
70M ./time-lagged-ens
56M ./extract
53M ./wet-bulb-temperature
52M ./recursive-filter
52M ./orographic_enhancement
40M ./wet-bulb-temperature-integral
38M ./wxcode
31M ./combine
26M ./wind-gust-diagnostic
26M ./resolve-wind-components
25M ./phase-change-level
24M ./apply-lapse-rate
23M ./temporal-interpolate
21M ./nbhood-land-and-sea
21M ./generate-percentiles
20M ./nowcast-optical-flow-from-winds
19M ./regrid
19M ./apply-emos-coefficients
17M ./generate-topography-bands-weights
15M ./spot-extract
15M ./between-thresholds
14M ./nbhood-iterate-with-mask
14M ./cloud-top-temperature
12M ./feels_like_temp
11M ./nowcast-optical-flow
11M ./generate-orographic-smoothing-coefficients
11M ./estimate-emos-coefficients
9.2M    ./wxcode-modal
8.9M    ./neighbour-finding
8.4M    ./estimate-emos-coefficients-from-table
7.8M    ./nowcast-extrapolate
6.3M    ./standardise
5.9M    ./cloud-condensation-level
5.8M    ./nowcast-accumulate
5.8M    ./generate-topography-bands-mask
5.6M    ./interpolate-using-difference
5.6M    ./apply-rainforests-calibration
5.2M    ./blend-adjacent-points
4.7M    ./lightning-from-cape-and-precip
4.2M    ./hail-fraction
3.6M    ./apply-bias-correction
3.1M    ./uv-index
3.1M    ./phase-probability
2.6M    ./freezing-rain
2.4M    ./vertical-updraught
2.4M    ./field-texture
2.1M    ./relabel_to_period
1.8M    ./merge
1.7M    ./apply-reliability-calibration
1.4M    ./hail-size
1.3M    ./temp-lapse-rate
1.2M    ./generate-metadata-cube
1.2M    ./extend-radar-mask
1.1M    ./vicinity
1.1M    ./fill-radar-holes
1.1M    ./calculate-forecast-bias
908K    ./create-grid-with-halo
904K    ./construct-reliability-tables
780K    ./manipulate-reliability-table
656K    ./generate-clearsky-solar-radiation
644K    ./shower-condition-probability
644K    ./expected-value
528K    ./phase-mask
520K    ./aggregate-reliability-tables
516K    ./remake-as-shower-condition
516K    ./max-in-time-window
396K    ./generate-solar-time
392K    ./snow-fraction
392K    ./sleet_probability
392K    ./generate-landmask
392K    ./convection-ratio
392K    ./blend-cycles-and-realizations
260K    ./interpret_metadata
20K ./shower-condition
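A listing like the one above can be regenerated as work progresses. The sketch below is one way to do that in Python, under the assumption that each test has its own subdirectory beneath a single root. It sums apparent file sizes, so the numbers may differ slightly from a tool that reports disk blocks used.

```python
from pathlib import Path


def directory_sizes(root):
    """Return (size_in_bytes, name) for each immediate subdirectory, largest first."""
    sizes = []
    for sub in Path(root).iterdir():
        if sub.is_dir():
            total = sum(f.stat().st_size for f in sub.rglob("*") if f.is_file())
            sizes.append((total, sub.name))
    return sorted(sizes, reverse=True)
```

For example, `for size, name in directory_sizes("."): print(f"{size / 1e6:.1f}M\t./{name}")` prints the directories in descending order of size.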

Approach

Where possible we will coarsen the resolution of data. This will maintain the geographic context, and an impression of the synoptic situation, in the test inputs and known good outputs. In cases where the grid scale is important, for example in neighbourhooding, recursive filter, or topographically aware processes, we will use a smaller domain to preserve the grid resolution whilst reducing the data volume.
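The difference between the two options can be sketched with plain NumPy arrays. Here coarsening is taken to mean thinning by a fixed stride (keeping every n-th grid point), and the grid size, stride and cut-out position are all made-up values for the example.

```python
import numpy as np

# Hypothetical 400x400 field standing in for a full-domain test input.
field = np.arange(400 * 400, dtype=np.float32).reshape(400, 400)

# Option 1: coarsen by thinning. Keeps the whole domain at lower resolution.
step = 4
coarsened = field[::step, ::step]

# Option 2: cut out a smaller domain. Keeps the original grid spacing.
y0, x0, size = 150, 150, 100
cropped = field[y0:y0 + size, x0:x0 + size]

# Both are 100x100, but neighbouring points mean different things:
# in `coarsened` they are 4 original grid lengths apart, in `cropped` only 1.
print(coarsened.shape, cropped.shape)
```

This is why grid-scale-sensitive processes such as neighbourhooding are better served by the cut-out: a neighbourhood radius expressed in grid lengths covers the same physical distance as before.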

Existing PR

https://github.com/metoppv/improver/pull/1805 is an existing PR which tackles:

apply-night-mask
generate-realizations
weighted-blending
wind-downscaling

These are the four sets of tests that account for the largest amount of data. We are going to rework this slightly to coarsen the data rather than shrink the domain where possible.

Reworked

I've updated the PR following the recreation of much of this acceptance test data. The changes are:

coarsen rather than cut out the data for apply-night-mask, generate-realizations, weighted-blending (where not using spatial weights).
reuse the cut out sections for wind-downscaling, and weighted-blending where spatial weights are used.

The acceptance test data can be found in branch: shrink_atd_1 (under ppdev)

Notes on first tranche of changes

Size at outset: 2.1GB
Expected size after these changes: 638MB

Tests considered in this first tranche


 'apply-bias-correction',
 'apply-night-mask',
 'between-thresholds',
 'blend-adjacent-points',
 'blend-cycles-and-realizations',
 'cloud-condensation-level',
 'cloud-top-temperature',
 'combine',
 'construct-reliability-tables',
 'convection-ratio',
 'estimate-emos-coefficients-from-table',
 'expected-value',
 'extend-radar-mask',
 'extract',
 'feels_like_temp',
 'freezing-rain',
 'generate-clearsky-solar-radiation',
 'generate-landmask',
 'generate-percentiles',
 'generate-realizations',
 'generate-solar-time',
 'hail-fraction',
 'hail-size',
 'interpret_metadata',
 'lightning-from-cape-and-precip',
 'max-in-time-window',
 'merge',
 'relabel_to_period',
 'remake-as-shower-condition',
 'resolve-wind-components',
 'shower-condition-probability',
 'standardise',
 'temporal-interpolate',
 'time-lagged-ens',
 'uv-index',
 'weighted_blending',
 'wet-bulb-temperature',
 'wet-bulb-temperature-integral',
 'wind-gust-diagnostic',
 'wind_direction',
 'wind_downscaling'

The following were not actually modified as there was no need to shrink them:

construct-reliability-tables
expected-value
interpret_metadata
generate-clearsky-solar-radiation
generate-solar-time
convection-ratio
blend-cycles-and-realizations

Extract CLI

The grid KGO are now tiny (UK 2x2, lat-lon 10x7). As this functionality is demonstrating extraction / subsetting rather than any kind of scientific processing I am happy with this.

test_generate_realizations.py - test_probabilities_reordering

The KGO for this had to be replaced. The new inputs have a different shape to the originals (thinned x/y). This means that the random numbers generated in the ensemble reordering are different, even though the random seed is fixed. As a result, at locations where there is a tied value (i.e. two identical temperatures for two different members) in the raw ensemble, and the tie needs to be resolved, we get a different result with the thinned data.

This figure shows the differences between the output generated by the test and the xy-thinned KGO.
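The shape dependence can be reproduced in isolation. The sketch below assumes the tie-breaking random numbers are drawn as one array matching the grid shape from a seeded `np.random.RandomState`; that is an assumption about the mechanism made for illustration, and the function name and grid sizes are hypothetical.

```python
import numpy as np


def tie_break_field(shape, seed=0):
    """Draw one random number per grid point from a fixed seed."""
    return np.random.RandomState(seed).rand(*shape)


full = tie_break_field((8, 8))      # stands in for the original grid
thinned = tie_break_field((4, 4))   # stands in for the xy-thinned grid

# The random stream is identical: the first 16 draws match exactly.
same_stream = np.array_equal(full.ravel()[:16], thinned.ravel())

# But the draws land on different locations, so the values at the grid
# points retained by thinning (every 2nd point) do not match.
same_at_locations = np.allclose(full[::2, ::2], thinned)

print(same_stream, same_at_locations)
```

Under that assumption, any tie resolved by comparing these random values can go the other way on the thinned grid, which is consistent with needing a new KGO.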

test_generate_realizations.py - test_ecc_bounds_warning

I had to offset the xy-thinning in the x-direction (by 16 grid points) to ensure the new grid included a 300 m/s value that exceeds the ECC bounds.
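The effect of an offset on strided thinning can be sketched as follows. The grid size, thinning step, location of the extreme value and the resulting offset are all hypothetical and chosen only to make the example small; they are not the values used for the real test data.

```python
import numpy as np


def thin_xy(data, step, x_offset=0):
    """Keep every `step`-th point in y and x, starting x at `x_offset`."""
    return data[::step, x_offset::step]


# Hypothetical wind speed field with a single extreme value.
wind = np.full((64, 64), 10.0, dtype=np.float32)
wind[20, 18] = 300.0

step = 4
no_offset = thin_xy(wind, step)              # x columns 0, 4, 8, ... miss column 18
with_offset = thin_xy(wind, step, x_offset=2)  # x columns 2, 6, ..., 18, ... include it

print(no_offset.max(), with_offset.max())
```

Without the offset the extreme value falls between retained columns and the thinned input no longer triggers the bounds warning that the test is meant to exercise.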

standardise - radarnet

These tests involve nimrod file inputs. These need different tooling to resize, so I've left this for now.

construct reliability tables

The data for these tests is already small, so it has not been modified.

expected-value

Already small test data, left unchanged.

estimate-emos-coefficients-from-table

Includes parquet files that make up most of the data, so this needs individual attention.

weighted-blending

All the tests that use the spatial weights have been left unchanged. The next job will be to copy in Tom's cut out section changes for these tests.

List of PRs within tranche 1

PR	Tests modified	Branch	Plotting
https://github.com/metoppv/improver/pull/1805	apply-night-mask, generate-realizations, weighted-blending, wind_downscaling	shrink_atd_1	Plots
#1857	wind_direction, time-lagged-ens, extract, wet-bulb-temperature, wet-bulb-temperature-integral	shrink_atd_2	Plots
#1858	combine, wind-gust-diagnostic, resolve-wind-components, temporal-interpolate, generate-percentiles	shrink_atd_3	Plots
#1859	between-thresholds, cloud-condensation-level, cloud-top-temperature, feels_like_temp	shrink_atd_4	Plots
#1860	standardise, blend-adjacent-points, lightning-from-cape-and-precip, hail-fraction, apply-bias-correction	shrink_atd_5	Plots
#1861	uv-index, freezing-rain, relabel_to_period, merge, hail-size	shrink_atd_6	Plots
#1862	extend-radar-mask, shower-condition-probability	shrink_atd_7	Plots
#1863	remake-as-shower-condition, max-in-time-window, generate-landmask	shrink_atd_8	Plots

Summary of Tranche 1 changes

The data volume for the current acceptance test data following the first tranche of changes is 638MB, as expected.

Test directory sizes

Acceptance test directory sizes after tranche 1


98M ./nbhood
83M ./threshold
52M ./recursive-filter
52M ./orographic_enhancement
38M ./wxcode
25M ./phase-change-level
24M ./apply-lapse-rate
21M ./nbhood-land-and-sea
20M ./nowcast-optical-flow-from-winds
19M ./regrid
19M ./apply-emos-coefficients
17M ./generate-topography-bands-weights
15M ./spot-extract
14M ./nbhood-iterate-with-mask
11M ./nowcast-optical-flow
11M ./generate-orographic-smoothing-coefficients
11M ./estimate-emos-coefficients
9.2M    ./wxcode-modal
8.9M    ./neighbour-finding
8.4M    ./estimate-emos-coefficients-from-table
7.8M    ./nowcast-extrapolate
5.8M    ./nowcast-accumulate
5.8M    ./generate-topography-bands-mask
5.6M    ./interpolate-using-difference
5.6M    ./apply-rainforests-calibration
5.0M    ./weighted_blending
3.8M    ./combine
3.1M    ./phase-probability
2.7M    ./wind_downscaling
2.6M    ./standardise
2.4M    ./vertical-updraught
2.4M    ./field-texture
2.2M    ./generate-realizations
1.8M    ./wet-bulb-temperature-integral
1.7M    ./wet-bulb-temperature
1.7M    ./extract
1.7M    ./apply-reliability-calibration
1.6M    ./apply-bias-correction
1.4M    ./time-lagged-ens
1.3M    ./temp-lapse-rate
1.2M    ./generate-metadata-cube
1.1M    ./vicinity
1.1M    ./temporal-interpolate
1.1M    ./hail-fraction
1.1M    ./generate-percentiles
1.1M    ./freezing-rain
1.1M    ./fill-radar-holes
1.1M    ./calculate-forecast-bias
1.1M    ./blend-adjacent-points
912K    ./apply-night-mask
908K    ./create-grid-with-halo
904K    ./construct-reliability-tables
780K    ./manipulate-reliability-table
780K    ./hail-size
780K    ./cloud-top-temperature
656K    ./generate-clearsky-solar-radiation
652K    ./cloud-condensation-level
648K    ./feels_like_temp
644K    ./expected-value
528K    ./phase-mask
524K    ./wind_direction
520K    ./wind-gust-diagnostic
520K    ./aggregate-reliability-tables
516K    ./merge
516K    ./lightning-from-cape-and-precip
396K    ./generate-solar-time
392K    ./snow-fraction
392K    ./sleet_probability
392K    ./resolve-wind-components
392K    ./extend-radar-mask
392K    ./convection-ratio
392K    ./blend-cycles-and-realizations
388K    ./shower-condition-probability
388K    ./max-in-time-window
388K    ./enforce-consistent-probabilities
264K    ./uv-index
264K    ./generate-landmask
260K    ./remake-as-shower-condition
260K    ./relabel_to_period
260K    ./interpret_metadata
260K    ./between-thresholds

Tranche 2

These are the remaining tests that will be tackled in tranche 2.

Tranche 2 tests for data reduction


 'aggregate-reliability-tables',
 'apply-emos-coefficients',
 'apply-lapse-rate',
 'apply-rainforests-calibration',
 'apply-reliability-calibration',
 'calculate-forecast-bias',
 'create-grid-with-halo',
 'enforce-consistent-probabilities',
 'estimate-emos-coefficients',
 'field-texture',
 'fill-radar-holes',
 'generate-metadata-cube',
 'generate-orographic-smoothing-coefficients',
 'generate-topography-bands-mask',
 'generate-topography-bands-weights',
 'interpolate-using-difference',
 'manipulate-reliability-table',
 'nbhood',
 'nbhood-iterate-with-mask',
 'nbhood-land-and-sea',
 'neighbour-finding',
 'nowcast-accumulate',
 'nowcast-extrapolate',
 'nowcast-optical-flow',
 'nowcast-optical-flow-from-winds',
 'orographic_enhancement',
 'phase-change-level',
 'phase-mask',
 'phase-probability',
 'recursive-filter',
 'regrid',
 'shower-condition',
 'sleet_probability',
 'snow-fraction',
 'spot-extract',
 'temp-lapse-rate',
 'threshold',
 'vertical-updraught',
 'vicinity',
 'wind_direction',
 'wind_downscaling',
 'wxcode',
 'wxcode-modal'

List of PRs within tranche 2

PR	Tests modified	Branch	Plotting
https://github.com/metoppv/improver/pull/1872	neighbour-finding, spot-extract	shrink_atd_9	Plots
https://github.com/metoppv/improver/pull/1874	wxcode, wxcode-modal, field-texture, apply-lapse-rate	shrink_atd_10	Plots
https://github.com/metoppv/improver/pull/1875	vicinity, vertical-updraught, threshold	shrink_atd_11	Plots
https://github.com/metoppv/improver/pull/1876	regrid, recursive-filter, interpolate-using-difference	shrink_atd_12	Plots
https://github.com/metoppv/improver/pull/1877	phase-probability, phase-change-level	shrink_atd_13	Plots
https://github.com/metoppv/improver/pull/1881	nowcast-accumulate, nowcast-extrapolate, nowcast-optical-flow-from-winds, nowcast-optical-flow, orographic_enhancement	shrink_atd_14	Plots
https://github.com/metoppv/improver/pull/1882	nbhood, nbhood-land-and-sea, nbhood-iterate-with-mask	shrink_atd_15	Plots
https://github.com/metoppv/improver/pull/1883	generate-topography-bands-mask, generate-topography-bands-weights, create-grid-with-halo, generate-orographic-smoothing-coefficients, fill-radar-holes	shrink_atd_16	Plots
https://github.com/metoppv/improver/pull/1884	apply-reliability-calibration	shrink_atd_17	Plots

CLIs for which tests did not require shrinking

wind-direction
temp-lapse-rate
snow-fraction
sleet-probability
phase-mask
generate-metadata-cube
enforce-consistent-probabilities

Follow up work

https://github.com/metoppv/improver/pull/1874#discussion_r1118668036

I have not shrunk the EMOS related test data. The changes required are to reduce the numbers of realizations, percentiles, and thresholds in the various gridded tests. This changes the KGO in a way that means we cannot trivially demonstrate that the data is "fundamentally unchanged" by the process. These two sets of tests occupy around 30MB so we may need to come back to them.

All branches merged. This work now moves into the follow-on ticket: https://github.com/metoppv/improver/issues/1849

Final size 138MB. This is not quite as small as originally intended, as the EMOS tests were not shrunk due to their complexity and time constraints. This could be revisited in future if so desired.

Reduce size of acceptance test data #1804