pySTEPS / pysteps

Python framework for short-term ensemble prediction systems.
https://pysteps.github.io/
BSD 3-Clause "New" or "Revised" License

Cascade scale fix #283

Closed · pulkkins closed this 2 years ago

pulkkins commented 2 years ago

It was pointed out in Issue #276 that when using the cascade decomposition with the default parameters, the ratio between successive scales is not constant: there is a gap between the first scale and the remaining ones. This can be fixed simply by setting

l_0 = (0.5 * l) ** (1 / (n - 1))

So, I made this the new default value in cascade.bandpass_filters.filter_gaussian.
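As a sanity check, here is a minimal sketch of what the new default implies. The layout of the central wavenumbers (zero for the first level, powers of l_0 for the rest) is my reading of the discussion below, not code taken from the PR:

```python
import numpy as np

# With the new default, the nonzero central wavenumbers form a geometric
# sequence with constant ratio l_0 (level 0 holds wavenumber 0).
l, n = 512, 6                                  # grid size and cascade levels
l_0 = (0.5 * l) ** (1 / (n - 1))               # ~3.03 for these values
central_wavenumbers = np.array([0.0] + [l_0 ** k for k in range(1, n)])
print(np.round(central_wavenumbers, 2))        # [0. 3.03 9.19 27.86 84.45 256.]
print(central_wavenumbers[2:] / central_wavenumbers[1:-1])  # all equal to l_0
```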

codecov[bot] commented 2 years ago

Codecov Report

Merging #283 (495bf79) into master (fd2e665) will decrease coverage by 0.03%. The diff coverage is 87.17%.

@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage   82.63%   82.60%   -0.04%     
==========================================
  Files         159      159              
  Lines       12186    12190       +4     
==========================================
- Hits        10070    10069       -1     
- Misses       2116     2121       +5     
| Flag | Coverage Δ |
| --- | --- |
| unit_tests | 82.60% <87.17%> (-0.04%) ↓ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| pysteps/cascade/interface.py | 92.30% <ø> (-0.55%) ↓ |
| pysteps/tests/test_nowcasts_steps.py | 100.00% <ø> (ø) |
| pysteps/cascade/decomposition.py | 87.75% <55.55%> (-3.55%) ↓ |
| pysteps/cascade/bandpass_filters.py | 91.01% <96.66%> (-1.22%) ↓ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update fd2e665...495bf79.

tkokkone commented 2 years ago

Thanks to @pulkkins for addressing this, and sorry it has taken a while for me to have a look. I am afraid I still have a slight hiccup with understanding the spatial scale value related to the central wavenumber of zero. Let us consider a 512 x 512 grid with 6 cascade levels. The new, updated filter_gaussian then gives the following central wavenumbers (WN):

[Image: table of central wavenumbers (WN) and corresponding spatial scales for cascade levels 0-5]

So if the relationship between wavenumbers and spatial scale values is as I presume in Issue https://github.com/pySTEPS/pysteps/issues/276, then the corresponding spatial scale values are as shown above in column Scale for levels 1-5. Now if the scale ratio is constant between all levels, the spatial scale value for level 0 (WN = 0) should be 256, which is the grid dimension (512) divided by two. But in Figure 1 in Pulkkinen et al. (2019) (https://gmd.copernicus.org/articles/12/4185/2019/) the spatial scale for WN = 0 equals the grid dimension. Should it instead be the grid dimension divided by two?

I am much more an engineer than a mathematician, but still it would be nice to understand how the wavenumbers and spatial scale values are related to each other. Any help is appreciated!

pulkkins commented 2 years ago

> Now if the scale ratio is constant between all levels, the spatial scale value for level 0 (WN = 0) should be 256, which is the grid dimension (512) divided by two. [...] Should it instead be the grid dimension divided by two?

I'm using this relation between the Fourier wavenumbers and the corresponding spatial scales:

scale (km) = 0.5 * grid_size (pixels) / wavenumber * grid_resolution (km)

For the first scale (wavenumber 0), this relation is not defined. Since the zeroth Fourier wavenumber corresponds to the field mean, my interpretation is that it should correspond to the grid dimension, not the dimension divided by two. The latter would correspond to wavenumber 1.
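Wrapped as a small helper for illustration (the function name is mine, not part of pysteps):

```python
import numpy as np

def wavenumber_to_scale(wavenumber, grid_size, grid_resolution):
    # scale (km) = 0.5 * grid_size (pixels) / wavenumber * grid_resolution (km)
    return 0.5 * grid_size / np.asarray(wavenumber, dtype=float) * grid_resolution

# a 512-pixel grid at 1 km resolution: wavenumber 1 spans half the domain
print(wavenumber_to_scale([1, 2, 256], 512, 1.0))  # [256. 128.   1.]
# wavenumber 0 would divide by zero, which is the undefined case above
```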

pulkkins commented 2 years ago

I tested another modification of the formula for l_0. Choosing

l_0 = (0.5 * l / (2 ** (n - 2))) ** (1 / (n - 1))

gives the following bandpass filters:

[Image: bandpass filter weights (FMI configuration)]

In this case, the ratio between successive scales is constant and the first scale equals the grid dimension. I also suggest replacing the Gaussian function of the first filter with an indicator function f such that f(0) = 1 and f(x) = 0 for x > 0, and setting all other filters to zero at the zeroth wavenumber. This choice is justified because the zeroth wavenumber corresponds to the field mean, which in my opinion should be treated separately. Does this make sense?

Edit: if we make the above modification, the first cascade level should not be included in the decomposition at all. Its only purpose would be to represent the field mean, which can be subtracted before doing the decomposition.
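A minimal sketch of the indicator-function idea, assuming a (n_levels, n_wavenumbers) weight layout (names and layout are mine, for illustration only):

```python
import numpy as np

def apply_mean_indicator(weights, wavenumbers):
    """Make the first filter select only the field mean (wavenumber 0)."""
    weights = weights.copy()
    dc = wavenumbers == 0                   # zeroth wavenumber = field mean
    weights[0] = np.where(dc, 1.0, 0.0)     # f(0) = 1, f(x) = 0 for x > 0
    weights[1:, dc] = 0.0                   # remaining filters ignore the mean
    return weights
```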

pulkkins commented 2 years ago

Another option would be to completely omit the first filter centered at zero frequency. Following the approach of https://www.researchgate.net/publication/264943060_Stochastic_simulation_of_space-time_rainfall_patterns_for_the_Brisbane_river_catchment, I also modified the filter construction so that for each filter we choose the midpoint of the corresponding interval as the central frequency. This results in the following filter configuration:

[Image: bandpass filter weights with midpoint central frequencies]
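For illustration, here is one way to realize the midpoint construction under my own assumptions (log-spaced interval endpoints from wavenumber 1 to the Nyquist wavenumber; the PR's actual construction may differ):

```python
import numpy as np

def midpoint_central_wavenumbers(grid_size, n_levels):
    # interval endpoints, evenly spaced in log space from 1 to grid_size / 2
    endpoints = np.logspace(0.0, np.log10(0.5 * grid_size), n_levels + 1)
    # the geometric midpoint of each interval becomes its central wavenumber
    return np.sqrt(endpoints[:-1] * endpoints[1:])

print(np.round(midpoint_central_wavenumbers(512, 6), 2))
# [  1.59   4.    10.08  25.4   64.   161.27]
```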

@dnerini and @tkokkone What do you think about this?

dnerini commented 2 years ago

Thanks Seppo for the interesting discussion. I quite like your last option, where you chose to omit the zeroth wavenumber, which is anyway undefined. Would this assume that the field is centered before the decomposition?

tkokkone commented 2 years ago

I also like this last alternative, as we would not need to worry about the troublesome spatial scale related to the zeroth wavenumber but could simply use the relationship Seppo gave above. BTW, should the y-axis in the figure above be scaled so that the weights sum up to one? That seems to be the case in Figure 1 of https://gmd.copernicus.org/articles/12/4185/2019/. But this could be just a matter of presentation.

pulkkins commented 2 years ago

> Would this assume that the field is centered before the decomposition?

Yes. I think I will do this by subtracting the mean value before the decomposition and then adding it back after recomposing the cascade.
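A minimal sketch of that workflow, assuming the current filter_gaussian/decomposition_fft interfaces and a non-normalized cascade:

```python
import numpy as np
from pysteps.cascade.bandpass_filters import filter_gaussian
from pysteps.cascade.decomposition import decomposition_fft

field = np.random.randn(512, 512)      # placeholder for a precipitation field
field_mean = field.mean()

bp_filter = filter_gaussian(field.shape, 6)
cascade = decomposition_fft(field - field_mean, bp_filter)

# ... the nowcasting steps operate on the zero-mean cascade here ...

# recompose by summing the levels, then add the mean back
recomposed = cascade["cascade_levels"].sum(axis=0) + field_mean
print(np.allclose(recomposed, field, atol=1e-6))  # approximately recovered
```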

pulkkins commented 2 years ago

> BTW, should the y-axis in the figure above be scaled so that the weights sum up to one?

Here is the same figure with normalized weights:

[Image: normalized bandpass filter weights]
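For reference, the normalization itself is a one-liner under the assumed (n_levels, n_wavenumbers) layout:

```python
import numpy as np

def normalize_filter_weights(weights):
    # rescale so the weights across all levels sum to one at each wavenumber
    return weights / weights.sum(axis=0, keepdims=True)
```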

dnerini commented 2 years ago

I noticed that in ca1c5c4 you adjusted the accuracy thresholds to make the tests pass. I'm not against this, but I'd like to better understand the cause. Should we be concerned by this drop in quality? Or is it mostly caused by a lower spread in the predictions due to the new cascade decomposition?

pulkkins commented 2 years ago

> Should we be concerned by this drop in quality? Or is it mostly caused by a lower spread in the predictions due to the new cascade decomposition?

Apart from the three cases where I modified the thresholds, the CRPS was in fact slightly better with the new filter configuration. In any case, we need a careful evaluation of any changes in forecast skill before merging this pull request.

pulkkins commented 2 years ago

I did some verification experiments with the old and new bandpass filters. Here are the results with the FMI data in the repository (nowcast start times between 16:00-16:30 UTC on September 28th 2016). The models were run with their default configurations (STEPS with velocity perturbations disabled).

It seems that the new filters give a slightly better forecast skill. However, as noted above, the skill with the new filters can also be slightly lower in some cases. Should we merge this pull request?

Deterministic metrics for S-PROG nowcasts:

[Images: ETS at the 0.1, 1.0 and 5.0 mm/h thresholds, and MAE]

CRPS for STEPS nowcasts:

[Image: CRPS]
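For reproducibility, a sketch of how such scores can be computed with pysteps' verification module (the random arrays are placeholders for real nowcasts and observations):

```python
import numpy as np
from pysteps.verification.probscores import CRPS
from pysteps.verification.detcatscores import det_cat_fct
from pysteps.verification.detcontscores import det_cont_fct

obs = np.random.gamma(1.0, size=(512, 512))           # placeholder observation
ens_fct = np.random.gamma(1.0, size=(24, 512, 512))   # STEPS-like ensemble
det_fct = ens_fct.mean(axis=0)                        # stand-in for S-PROG

print("CRPS:", CRPS(ens_fct, obs))
print("MAE:", det_cont_fct(det_fct, obs, scores=["MAE"]))
for thr in (0.1, 1.0, 5.0):
    print(f"ETS (thr={thr}):", det_cat_fct(det_fct, obs, thr, scores=["ETS"]))
```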

dnerini commented 2 years ago

This looks great, thanks Seppo for the analysis! Yes, all good on my side, you can merge into master. Great work!