metoppv / improver

IMPROVER is a library of algorithms for meteorological post-processing.
http://improver.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Reduce size of acceptance test data #1804

Closed tjtg closed 1 year ago

tjtg commented 2 years ago

During an in-person discussion, it was agreed that reducing the size of the acceptance test data is a useful target, as this will make maintenance of the dataset easier and make the acceptance tests quicker to run.

The intention is to reduce spatial dimensions to somewhere around 50x50 to 150x150, depending on the diagnostic and the plugin - smaller is better if possible, but some will need a larger spatial area to include realistic weather features in the data. Some data files contain probability thresholds, percentiles or ensemble members and these can also be thinned out where practical to reduce the data size.
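As a rough illustration of thinning out thresholds, percentiles or ensemble members, the sketch below keeps only selected entries along the leading axis of an array. It uses plain NumPy rather than IMPROVER or iris; the function name `thin_leading_axis`, the threshold values and the 200x200 grid are all assumptions made for the example.

```python
import numpy as np


def thin_leading_axis(data, coord_values, keep):
    """Keep only the leading-axis entries whose coordinate value is in `keep`.

    Hypothetical helper for illustration; not part of IMPROVER.
    """
    coord_values = np.asarray(coord_values)
    mask = np.isin(coord_values, keep)
    return data[mask], coord_values[mask]


# Hypothetical probability field: 9 thresholds on a 200x200 grid.
thresholds = np.array([0.03, 0.1, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
probs = np.zeros((thresholds.size, 200, 200), dtype=np.float32)

# Retain three of the nine thresholds; the data volume falls in proportion.
thinned, kept = thin_leading_axis(probs, thresholds, keep=[0.1, 1.0, 4.0])
print(thinned.shape, thinned.nbytes / probs.nbytes)
```

The same slicing applies to a realization or percentile axis; which entries to keep depends on what each test needs to exercise.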

A few tests are currently disabled because they take a very long time to run on the current dataset. These could be re-enabled, as smaller input datasets will almost certainly take less time to run.

These changes are best done over a series of PRs, rather than one large PR. This series-of-PRs approach will allow incremental reductions in the data size, reduce the chance of merge conflicts and also assist with keeping the review size manageable. Most of the review effort will go into looking at the data files; the code changes are likely to be small.

Acceptance criteria:

The synthetic data approach as described in #1218, #1274, #1275 and #1276 is still valuable, but will require significantly more resources to implement.

bayliffe commented 1 year ago

Test directory sizes at outset

The list below shows the initial state of the acceptance test data in terms of data volume. The list is provided in descending order of size so that work can start on the largest sets of test files first.

Acceptance test directory sizes

535M    ./wind_downscaling
236M    ./generate-realizations
153M    ./weighted_blending
102M    ./apply-night-mask
98M     ./nbhood
97M     ./wind_direction
83M     ./threshold
70M     ./time-lagged-ens
56M     ./extract
53M     ./wet-bulb-temperature
52M     ./recursive-filter
52M     ./orographic_enhancement
40M     ./wet-bulb-temperature-integral
38M     ./wxcode
31M     ./combine
26M     ./wind-gust-diagnostic
26M     ./resolve-wind-components
25M     ./phase-change-level
24M     ./apply-lapse-rate
23M     ./temporal-interpolate
21M     ./nbhood-land-and-sea
21M     ./generate-percentiles
20M     ./nowcast-optical-flow-from-winds
19M     ./regrid
19M     ./apply-emos-coefficients
17M     ./generate-topography-bands-weights
15M     ./spot-extract
15M     ./between-thresholds
14M     ./nbhood-iterate-with-mask
14M     ./cloud-top-temperature
12M     ./feels_like_temp
11M     ./nowcast-optical-flow
11M     ./generate-orographic-smoothing-coefficients
11M     ./estimate-emos-coefficients
9.2M    ./wxcode-modal
8.9M    ./neighbour-finding
8.4M    ./estimate-emos-coefficients-from-table
7.8M    ./nowcast-extrapolate
6.3M    ./standardise
5.9M    ./cloud-condensation-level
5.8M    ./nowcast-accumulate
5.8M    ./generate-topography-bands-mask
5.6M    ./interpolate-using-difference
5.6M    ./apply-rainforests-calibration
5.2M    ./blend-adjacent-points
4.7M    ./lightning-from-cape-and-precip
4.2M    ./hail-fraction
3.6M    ./apply-bias-correction
3.1M    ./uv-index
3.1M    ./phase-probability
2.6M    ./freezing-rain
2.4M    ./vertical-updraught
2.4M    ./field-texture
2.1M    ./relabel_to_period
1.8M    ./merge
1.7M    ./apply-reliability-calibration
1.4M    ./hail-size
1.3M    ./temp-lapse-rate
1.2M    ./generate-metadata-cube
1.2M    ./extend-radar-mask
1.1M    ./vicinity
1.1M    ./fill-radar-holes
1.1M    ./calculate-forecast-bias
908K    ./create-grid-with-halo
904K    ./construct-reliability-tables
780K    ./manipulate-reliability-table
656K    ./generate-clearsky-solar-radiation
644K    ./shower-condition-probability
644K    ./expected-value
528K    ./phase-mask
520K    ./aggregate-reliability-tables
516K    ./remake-as-shower-condition
516K    ./max-in-time-window
396K    ./generate-solar-time
392K    ./snow-fraction
392K    ./sleet_probability
392K    ./generate-landmask
392K    ./convection-ratio
392K    ./blend-cycles-and-realizations
260K    ./interpret_metadata
20K     ./shower-condition
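
A listing like the one above looks like the output of a `du`-style command sorted by size. For anyone wanting to reproduce it from Python, a minimal sketch follows. The function names are hypothetical, and it sums apparent file sizes, so the numbers may differ slightly from disk-usage figures that count allocated blocks.

```python
import os


def directory_size(path):
    """Total size in bytes of all regular files below `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def sizes_descending(base):
    """Return (subdirectory name, size in bytes) pairs, largest first."""
    sizes = {}
    for entry in os.scandir(base):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = directory_size(entry.path)
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Calling `sizes_descending` on the root of the acceptance test data directory would give the ordering used to decide which test directories to tackle first.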

Approach

Where possible we will coarsen the resolution of data. This will maintain the geographic context, and an impression of the synoptic situation, in the test inputs and known good outputs. In cases where the grid scale is important, for example in neighbourhooding, recursive filter, or topographically aware processes, we will use a smaller domain to preserve the grid resolution whilst reducing the data volume.
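
The two strategies can be sketched with NumPy slicing. This is only an illustration: the function names, the 960x1040 grid and the step and cut-out sizes are assumptions, and real test data would be handled as cubes with coordinates that need thinning or cropping in the same way.

```python
import numpy as np


def coarsen(field, step, offset=(0, 0)):
    """Keep every `step`-th point in y and x; the domain extent is preserved."""
    y0, x0 = offset
    return field[..., y0::step, x0::step]


def crop(field, y_start, x_start, size):
    """Cut out a size x size sub-domain; the grid spacing is preserved."""
    return field[..., y_start:y_start + size, x_start:x_start + size]


# Hypothetical full-resolution field.
field = np.arange(960 * 1040, dtype=np.float32).reshape(960, 1040)

coarse = coarsen(field, step=10)     # whole domain at one tenth of the points per axis
cutout = crop(field, 400, 450, 100)  # small domain at the original grid spacing
print(coarse.shape, cutout.shape)
```

Coarsening suits tests where the geographic picture matters more than the grid length, while cropping suits neighbourhooding, recursive filtering and topography-aware tests where the spacing between adjacent points is part of what is being tested.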

bayliffe commented 1 year ago

Existing PR

https://github.com/metoppv/improver/pull/1805 is an existing PR which tackles the four sets of tests that comprise the largest amount of data. We are going to rework this slightly to coarsen the data rather than shrink the domain where possible.

Reworked

I've updated the PR following the recreation of much of this acceptance test data. The changes are:

The acceptance test data can be found in branch: shrink_atd_1 (under ppdev)

bayliffe commented 1 year ago

Notes on first tranche of changes

Size at outset: 2.1GB

Expected size after these changes: 638MB

Tests considered in this first tranche

 'apply-bias-correction',
 'apply-night-mask',
 'between-thresholds',
 'blend-adjacent-points',
 'blend-cycles-and-realizations',
 'cloud-condensation-level',
 'cloud-top-temperature',
 'combine',
 'construct-reliability-tables',
 'convection-ratio',
 'estimate-emos-coefficients-from-table',
 'expected-value',
 'extend-radar-mask',
 'extract',
 'feels_like_temp',
 'freezing-rain',
 'generate-clearsky-solar-radiation',
 'generate-landmask',
 'generate-percentiles',
 'generate-realizations',
 'generate-solar-time',
 'hail-fraction',
 'hail-size',
 'interpret_metadata',
 'lightning-from-cape-and-precip',
 'max-in-time-window',
 'merge',
 'relabel_to_period',
 'remake-as-shower-condition',
 'resolve-wind-components',
 'shower-condition-probability',
 'standardise',
 'temporal-interpolate',
 'time-lagged-ens',
 'uv-index',
 'weighted_blending',
 'wet-bulb-temperature',
 'wet-bulb-temperature-integral',
 'wind-gust-diagnostic',
 'wind_direction',
 'wind_downscaling'

The following were not actually modified as there was no need to shrink them:

construct-reliability-tables
expected-value
interpret_metadata
generate-clearsky-solar-radiation
generate-solar-time
convection-ratio
blend-cycles-and-realizations

Extract CLI

The grid KGOs are now tiny (UK 2x2, lat-lon 10x7). As this functionality demonstrates extraction / subsetting rather than any kind of scientific processing, I am happy with this.

test_generate_realizations.py - test_probabilities_reordering

The KGO for this had to be replaced. The new inputs have a different shape to the originals (thinned x/y). This means that the random numbers generated in the ensemble reordering are different, even though the random seed is fixed. As a result, at locations where there is a tied value (i.e. two identical temperatures for two different members) in the raw ensemble, and the tie needs to be resolved, we get a different result with the thinned data.
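
The effect can be reproduced in isolation with NumPy. The sketch below is not IMPROVER's reordering code: the function names and array shapes are assumptions, and the tie-breaking via `np.lexsort` is just one plausible way of resolving ties with seeded random numbers.

```python
import numpy as np


def seeded_noise(shape, seed=0):
    """Random numbers used to break ties, drawn with a fixed seed."""
    return np.random.RandomState(seed).random_sample(shape)


def order_with_random_ties(members, seed=0):
    """Member indices sorted by value at each grid point, ties broken randomly.

    `members` has shape (n_members, ny, nx).
    """
    noise = seeded_noise(members.shape, seed)
    # lexsort treats the last key as primary: sort by value, then by noise.
    return np.lexsort((noise, members), axis=0)


full_noise = seeded_noise((3, 8, 8))
thin_noise = seeded_noise((3, 4, 4))

# Same seed, different shape: the draws are laid out over a different grid,
# so a retained grid point generally receives a different random number.
print(np.array_equal(full_noise[:, ::2, ::2], thin_noise))
```

Because the tie-breaking numbers at a given location change with the array shape, tied members there can come out in a different order, which is why the thinned inputs needed a new KGO rather than a thinned copy of the old one.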

This figure shows the differences between the output generated by the test and the xy-thinned KGO.

[Figure: differences]

test_generate_realizations.py - test_ecc_bounds_warning

I had to offset the xy-thinning in the x-direction (by 16 grid points) to ensure the new grid included a 300 m/s value to exceed the ECC bounds.
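
A sketch of how such an offset could be chosen is given below. The grid size, thinning step and location of the extreme value are invented for the example, so the resulting offset is not the 16 grid points used in the real data.

```python
import numpy as np

# Hypothetical wind speed field with a single extreme value.
wind = np.full((64, 64), 5.0, dtype=np.float32)
wind[20, 37] = 300.0

step = 20
y_idx, x_idx = np.unravel_index(np.argmax(wind), wind.shape)

# Thinning from the origin skips column 37, so the extreme value is lost.
thinned_default = wind[::step, ::step]

# Start the x stride at the remainder so that column 37 is retained.
x_offset = int(x_idx) % step
thinned_offset = wind[::step, x_offset::step]

print(x_offset, thinned_default.max(), thinned_offset.max())
```

Here the row of the extreme value already falls on the stride, so only the x-direction needs an offset, mirroring the situation described above.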

standardise - radarnet

These tests involve nimrod file inputs. These need different tooling to resize, so I've left this for now.

construct reliability tables

The data for these tests is already small, so it has not been modified.

expected-value

Already small test data, left unchanged.

estimate-emos-coefficients-from-table

Includes parquet files that make up most of the data, so this needs individual attention.

weighted-blending

All the tests that use the spatial weights have been left unchanged. The next job will be to copy in Tom's cut out section changes for these tests.

bayliffe commented 1 year ago

List of PRs within tranche 1

| PR | Tests modified | Branch | Plotting |
|---|---|---|---|
| https://github.com/metoppv/improver/pull/1805 | apply-night-mask, generate-realizations, weighted-blending, wind_downscaling | shrink_atd_1 | Plots |
| #1857 | wind_direction, time-lagged-ens, extract, wet-bulb-temperature, wet-bulb-temperature-integral | shrink_atd_2 | Plots |
| #1858 | combine, wind-gust-diagnostic, resolve-wind-components, temporal-interpolate, generate-percentiles | shrink_atd_3 | Plots |
| #1859 | between-thresholds, cloud-condensation-level, cloud-top-temperature, feels_like_temp | shrink_atd_4 | Plots |
| #1860 | standardise, blend-adjacent-points, lightning-from-cape-and-precip, hail-fraction, apply-bias-correction | shrink_atd_5 | Plots |
| #1861 | uv-index, freezing-rain, relabel_to_period, merge, hail-size | shrink_atd_6 | Plots |
| #1862 | extend-radar-mask, shower-condition-probability | shrink_atd_7 | Plots |
| #1863 | remake-as-shower-condition, max-in-time-window, generate-landmask | shrink_atd_8 | Plots |

bayliffe commented 1 year ago

Summary of Tranche 1 changes

The data volume for the current acceptance test data following the first tranche of changes is 638MB, as expected.

Test directory sizes

Acceptance test directory sizes after tranche 1

98M     ./nbhood
83M     ./threshold
52M     ./recursive-filter
52M     ./orographic_enhancement
38M     ./wxcode
25M     ./phase-change-level
24M     ./apply-lapse-rate
21M     ./nbhood-land-and-sea
20M     ./nowcast-optical-flow-from-winds
19M     ./regrid
19M     ./apply-emos-coefficients
17M     ./generate-topography-bands-weights
15M     ./spot-extract
14M     ./nbhood-iterate-with-mask
11M     ./nowcast-optical-flow
11M     ./generate-orographic-smoothing-coefficients
11M     ./estimate-emos-coefficients
9.2M    ./wxcode-modal
8.9M    ./neighbour-finding
8.4M    ./estimate-emos-coefficients-from-table
7.8M    ./nowcast-extrapolate
5.8M    ./nowcast-accumulate
5.8M    ./generate-topography-bands-mask
5.6M    ./interpolate-using-difference
5.6M    ./apply-rainforests-calibration
5.0M    ./weighted_blending
3.8M    ./combine
3.1M    ./phase-probability
2.7M    ./wind_downscaling
2.6M    ./standardise
2.4M    ./vertical-updraught
2.4M    ./field-texture
2.2M    ./generate-realizations
1.8M    ./wet-bulb-temperature-integral
1.7M    ./wet-bulb-temperature
1.7M    ./extract
1.7M    ./apply-reliability-calibration
1.6M    ./apply-bias-correction
1.4M    ./time-lagged-ens
1.3M    ./temp-lapse-rate
1.2M    ./generate-metadata-cube
1.1M    ./vicinity
1.1M    ./temporal-interpolate
1.1M    ./hail-fraction
1.1M    ./generate-percentiles
1.1M    ./freezing-rain
1.1M    ./fill-radar-holes
1.1M    ./calculate-forecast-bias
1.1M    ./blend-adjacent-points
912K    ./apply-night-mask
908K    ./create-grid-with-halo
904K    ./construct-reliability-tables
780K    ./manipulate-reliability-table
780K    ./hail-size
780K    ./cloud-top-temperature
656K    ./generate-clearsky-solar-radiation
652K    ./cloud-condensation-level
648K    ./feels_like_temp
644K    ./expected-value
528K    ./phase-mask
524K    ./wind_direction
520K    ./wind-gust-diagnostic
520K    ./aggregate-reliability-tables
516K    ./merge
516K    ./lightning-from-cape-and-precip
396K    ./generate-solar-time
392K    ./snow-fraction
392K    ./sleet_probability
392K    ./resolve-wind-components
392K    ./extend-radar-mask
392K    ./convection-ratio
392K    ./blend-cycles-and-realizations
388K    ./shower-condition-probability
388K    ./max-in-time-window
388K    ./enforce-consistent-probabilities
264K    ./uv-index
264K    ./generate-landmask
260K    ./remake-as-shower-condition
260K    ./relabel_to_period
260K    ./interpret_metadata
260K    ./between-thresholds

Tranche 2

These are the remaining tests that will be tackled in tranche 2.

Tranche 2 tests for data reduction

 'aggregate-reliability-tables',
 'apply-emos-coefficients',
 'apply-lapse-rate',
 'apply-rainforests-calibration',
 'apply-reliability-calibration',
 'calculate-forecast-bias',
 'create-grid-with-halo',
 'enforce-consistent-probabilities',
 'estimate-emos-coefficients',
 'field-texture',
 'fill-radar-holes',
 'generate-metadata-cube',
 'generate-orographic-smoothing-coefficients',
 'generate-topography-bands-mask',
 'generate-topography-bands-weights',
 'interpolate-using-difference',
 'manipulate-reliability-table',
 'nbhood',
 'nbhood-iterate-with-mask',
 'nbhood-land-and-sea',
 'neighbour-finding',
 'nowcast-accumulate',
 'nowcast-extrapolate',
 'nowcast-optical-flow',
 'nowcast-optical-flow-from-winds',
 'orographic_enhancement',
 'phase-change-level',
 'phase-mask',
 'phase-probability',
 'recursive-filter',
 'regrid',
 'shower-condition',
 'sleet_probability',
 'snow-fraction',
 'spot-extract',
 'temp-lapse-rate',
 'threshold',
 'vertical-updraught',
 'vicinity',
 'wind_direction',
 'wind_downscaling',
 'wxcode',
 'wxcode-modal'

bayliffe commented 1 year ago

List of PRs within tranche 2

| PR | Tests modified | Branch | Plotting |
|---|---|---|---|
| https://github.com/metoppv/improver/pull/1872 | neighbour-finding, spot-extract | shrink_atd_9 | Plots |
| https://github.com/metoppv/improver/pull/1874 | wxcode, wxcode-modal, field-texture, apply-lapse-rate | shrink_atd_10 | Plots |
| https://github.com/metoppv/improver/pull/1875 | vicinity, vertical-updraught, threshold | shrink_atd_11 | Plots |
| https://github.com/metoppv/improver/pull/1876 | regrid, recursive-filter, interpolate-using-difference | shrink_atd_12 | Plots |
| https://github.com/metoppv/improver/pull/1877 | phase-probability, phase-change-level | shrink_atd_13 | Plots |
| https://github.com/metoppv/improver/pull/1881 | nowcast-accumulate, nowcast-extrapolate, nowcast-optical-flow-from-winds, nowcast-optical-flow, orographic_enhancement | shrink_atd_14 | Plots |
| https://github.com/metoppv/improver/pull/1882 | nbhood, nbhood-land-and-sea, nbhood-iterate-with-mask | shrink_atd_15 | Plots |
| https://github.com/metoppv/improver/pull/1883 | generate-topography-bands-mask, generate-topography-bands-weights, create-grid-with-halo, generate-orographic-smoothing-coefficients, fill-radar-holes | shrink_atd_16 | Plots |
| https://github.com/metoppv/improver/pull/1884 | apply-reliability-calibration | shrink_atd_17 | Plots |

CLIs for which tests did not require shrinking

Follow up work

https://github.com/metoppv/improver/pull/1874#discussion_r1118668036

I have not shrunk the EMOS related test data. The changes required are to reduce the numbers of realizations, percentiles, and thresholds in the various gridded tests. This changes the KGO in a way that means we cannot trivially demonstrate that the data is "fundamentally unchanged" by the process. These two sets of tests occupy around 30MB so we may need to come back to them.
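
To illustrate the distinction, the sketch below shows the kind of check that is trivially available when data is only thinned in x and y, and why it stops working once members are removed. The helper `kgo_unchanged` and the use of an ensemble mean as a stand-in for the processing are assumptions made for the example; they are not the EMOS code or the comparison actually used in the PRs.

```python
import numpy as np


def kgo_unchanged(old_kgo, new_kgo, step, offset=(0, 0), rtol=1e-5, atol=1e-8):
    """True if the new KGO matches the old KGO thinned with the same stride."""
    y0, x0 = offset
    expected = old_kgo[..., y0::step, x0::step]
    return expected.shape == new_kgo.shape and bool(
        np.allclose(expected, new_kgo, rtol=rtol, atol=atol)
    )


# Hypothetical 6-member ensemble on an 8x8 grid; the "processing" is a mean.
ensemble = np.random.RandomState(0).random_sample((6, 8, 8))
old_kgo = ensemble.mean(axis=0)

# xy-thinning the inputs: output is the old output at the retained points.
kgo_from_xy_thinned = ensemble[:, ::2, ::2].mean(axis=0)

# Dropping members: output values themselves change at every point.
kgo_from_fewer_members = ensemble[::2].mean(axis=0)

print(kgo_unchanged(old_kgo, kgo_from_xy_thinned, step=2))
print(kgo_unchanged(old_kgo, kgo_from_fewer_members, step=1))
```

With fewer realizations, percentiles or thresholds there is no subset of the old KGO to compare against, so the new KGO would need to be reviewed on its scientific merits instead.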

bayliffe commented 1 year ago

All branches merged. This work now moves into the follow-on ticket: https://github.com/metoppv/improver/issues/1849

Final size 138MB. This is not quite as small as originally intended, as the EMOS tests were not shrunk due to their complexity and time constraints. This could be revisited in future if so desired.