Merging #283 (495bf79) into master (fd2e665) will decrease coverage by 0.03%. The diff coverage is 87.17%.
```diff
@@            Coverage Diff             @@
##           master     #283      +/-   ##
==========================================
- Coverage   82.63%   82.60%   -0.04%
==========================================
  Files         159      159
  Lines       12186    12190       +4
==========================================
- Hits        10070    10069       -1
- Misses       2116     2121       +5
```
| Flag | Coverage Δ | |
| --- | --- | --- |
| unit_tests | 82.60% <87.17%> (-0.04%) | :arrow_down: |
Flags with carried forward coverage won't be shown.
| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| pysteps/cascade/interface.py | 92.30% <ø> (-0.55%) | :arrow_down: |
| pysteps/tests/test_nowcasts_steps.py | 100.00% <ø> (ø) | |
| pysteps/cascade/decomposition.py | 87.75% <55.55%> (-3.55%) | :arrow_down: |
| pysteps/cascade/bandpass_filters.py | 91.01% <96.66%> (-1.22%) | :arrow_down: |
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fd2e665...495bf79.
Thanks to @pulkkins for addressing this, and sorry it has taken a while for me to have a look. I am afraid I still have a slight hiccup with understanding the spatial scale value related to the central wavenumber of zero. Let us consider a 512 x 512 grid with 6 cascade levels. The new, updated filter_gaussian then gives the following central wavenumbers (WN):
So if the relationship between wavenumbers and spatial scale values is as I presume in Issue https://github.com/pySTEPS/pysteps/issues/276, then the corresponding spatial scale values are as shown above in the Scale column for levels 1-5. Now if the scale ratio is constant between all levels, the spatial scale value for level 0 (WN = 0) should be 256, which equals the grid dimension (512) divided by two. But in Figure 1 of Pulkkinen et al. (2019) (https://gmd.copernicus.org/articles/12/4185/2019/), the spatial scale for WN = 0 equals the grid dimension. Should it instead be the grid dimension divided by two?
I am much more an engineer than a mathematician, but it would still be nice to understand how the wavenumbers and spatial scale values are related to each other. Any help is appreciated!
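To make the constant-ratio reasoning concrete, here is a small numeric sketch. It assumes the central wavenumbers follow the geometric progression l_0**k implied by the default formula l_0 = (0.5 * l) ** (1 / (n - 1)) introduced in this pull request (see below); the actual filter_gaussian output may differ slightly.

```python
# Hedged numeric sketch: 512 x 512 grid, 6 cascade levels, 1 km resolution.
# Central wavenumbers l_0**k for k = 1..n-1 are an assumption, not
# necessarily the exact filter_gaussian output.
l, n = 512, 6
l_0 = (0.5 * l) ** (1 / (n - 1))             # ~3.03, the constant scale ratio
wavenumbers = [l_0**k for k in range(1, n)]  # ~[3.0, 9.2, 27.9, 84.4, 256.0]
scales = [0.5 * l / w for w in wavenumbers]  # ~[84.4, 27.9, 9.2, 3.0, 1.0]
# extrapolating the constant ratio one level up gives the level-0 scale:
print(scales[0] * l_0)                       # ~256.0 = grid dimension / 2
```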
> Thanks to @pulkkins for addressing this, and sorry it has taken a while for me to have a look. I am afraid I still have a slight hiccup with understanding the spatial scale value related to the central wavenumber of zero. Let us consider a 512 x 512 grid with 6 cascade levels. The new, updated filter_gaussian then gives the following central wavenumbers (WN):
> So if the relationship between wavenumbers and spatial scale values is as I presume in Issue #276, then the corresponding spatial scale values are as shown above in the Scale column for levels 1-5. Now if the scale ratio is constant between all levels, the spatial scale value for level 0 (WN = 0) should be 256, which equals the grid dimension (512) divided by two. But in Figure 1 of Pulkkinen et al. (2019) (https://gmd.copernicus.org/articles/12/4185/2019/), the spatial scale for WN = 0 equals the grid dimension. Should it instead be the grid dimension divided by two?
> I am much more an engineer than a mathematician, but it would still be nice to understand how the wavenumbers and spatial scale values are related to each other. Any help is appreciated!
I'm using this relation between the Fourier wavenumbers and the corresponding spatial scales:
scale (km) = 0.5 * grid_size (pixels) / wavenumber * grid_resolution (km)
For the first cascade level (wavenumber 0), this relation is not defined. Since the zeroth Fourier wavenumber corresponds to the field mean, my interpretation is that its scale should be the grid dimension, not the grid dimension divided by two; the latter corresponds to wavenumber 1.
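As a hedged illustration of this relation, including the interpretation of wavenumber 0 above (the helper name and example values are mine, not part of pysteps):

```python
def wavenumber_to_scale(wavenumber, grid_size, grid_resolution_km):
    """Spatial scale (km) for a Fourier wavenumber on a square grid.

    Illustrative helper, not a pysteps function. Wavenumber 0 (the
    field mean) is mapped to the full grid dimension, as argued above;
    wavenumber 1 then maps to half the grid dimension.
    """
    if wavenumber == 0:
        return grid_size * grid_resolution_km
    return 0.5 * grid_size / wavenumber * grid_resolution_km

print(wavenumber_to_scale(0, 512, 1.0))  # 512.0 km: the grid dimension
print(wavenumber_to_scale(1, 512, 1.0))  # 256.0 km: grid dimension / 2
```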
I tested another modification of the formula for l_0. By choosing
l_0 = (0.5 * l / (2 ** (n - 2))) ** (1 / (n - 1))
we get the following bandpass filters:
In this case, the ratio between each successive scale is constant and the first scale is the grid dimension. I also suggest replacing the Gaussian function of the first filter with an indicator function f such that f(0) = 1 and f(x) = 0 for x > 0. All other filters are set to zero for the 0-th wavenumber. This choice is justified because the 0-th wavenumber corresponds to the field mean, which in my opinion should be treated separately. Does this make sense?
Edit: If we make the above modification, the first cascade level should not be included in the decomposition at all. Its only purpose would be to represent the field mean, which can be subtracted before doing the decomposition.
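A minimal sketch of this level-0 treatment, assuming a hypothetical weights array with rows as cascade levels and columns as radial wavenumbers (not the actual pysteps data layout):

```python
import numpy as np

n_levels, max_wn = 6, 256
weights = np.random.rand(n_levels, max_wn + 1)  # stand-in Gaussian weights

# indicator function for the first filter: f(0) = 1, f(x) = 0 for x > 0
weights[0, :] = 0.0
weights[0, 0] = 1.0
# all other filters are set to zero at the 0-th wavenumber (the field mean)
weights[1:, 0] = 0.0
```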
Another option would be to completely omit the first filter centered at zero frequency. Following the approach of https://www.researchgate.net/publication/264943060_Stochastic_simulation_of_space-time_rainfall_patterns_for_the_Brisbane_river_catchment, I also modified the filter construction so that for each filter the midpoint of the corresponding interval is chosen as the central frequency. This results in the following filter configuration:
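A hedged sketch of such a midpoint-based construction (the log-spaced interval boundaries and the Gaussian width are my assumptions, not the exact implementation):

```python
import numpy as np

l, n = 512, 6
# log-spaced interval boundaries between wavenumbers 1 and l/2 (assumed)
bounds = np.logspace(0.0, np.log10(0.5 * l), n + 1)
# midpoint of each interval as the central frequency; no filter is
# centered at zero frequency
centers = 0.5 * (bounds[:-1] + bounds[1:])
# Gaussian bandpass weights in log2 space (the width 0.6 is a free choice)
wn = np.arange(1, l // 2 + 1, dtype=float)
weights = np.exp(
    -0.5 * ((np.log2(wn)[None, :] - np.log2(centers)[:, None]) / 0.6) ** 2
)
print(weights.shape)  # (6, 256): one bandpass filter per cascade level
```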
@dnerini and @tkokkone, what do you think about this?
Thanks, Seppo, for the interesting discussion. I quite like your last option, where you chose to omit the zeroth wavenumber, which is anyway undefined. Would this assume that the field is centered before the decomposition?
I also like this last alternative, as we would not need to worry about the troublesome spatial scale related to the zeroth wavenumber but could simply use the relationship Seppo gave above. BTW, should the y-axis in the figure above be scaled so that the weights sum up to one? That seems to be the case in Figure 1 of https://gmd.copernicus.org/articles/12/4185/2019/, but this could be just a matter of presentation.
> Thanks, Seppo, for the interesting discussion. I quite like your last option, where you chose to omit the zeroth wavenumber, which is anyway undefined. Would this assume that the field is centered before the decomposition?
Yes. I think I will do this by subtracting the mean value before the decomposition and then adding it back after recomposing the cascade.
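A minimal sketch of that mean handling, with decompose and recompose as hypothetical stand-ins for the pysteps cascade routines:

```python
import numpy as np

# hypothetical stand-ins for the pysteps cascade routines
def decompose(field):
    return {"cascade_levels": [field]}  # trivial one-level "cascade"

def recompose(cascade):
    return np.sum(cascade["cascade_levels"], axis=0)

field = np.random.rand(512, 512)
field_mean = field.mean()
cascade = decompose(field - field_mean)      # decompose the centered field
field_rec = recompose(cascade) + field_mean  # add the mean back afterwards
assert np.allclose(field_rec, field)
```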
> I also like this last alternative, as we would not need to worry about the troublesome spatial scale related to the zeroth wavenumber but could simply use the relationship Seppo gave above. BTW, should the y-axis in the figure above be scaled so that the weights sum up to one? That seems to be the case in Figure 1 of https://gmd.copernicus.org/articles/12/4185/2019/, but this could be just a matter of presentation.
Here is the same figure with normalized weights:
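For reference, the normalization itself is just a column-wise rescaling (a sketch; weights is a hypothetical (levels, wavenumbers) array, not the pysteps data structure):

```python
import numpy as np

weights = np.random.rand(6, 256)  # hypothetical (levels, wavenumbers) array
# rescale so that at each wavenumber the filter weights sum to one
weights_norm = weights / weights.sum(axis=0, keepdims=True)
assert np.allclose(weights_norm.sum(axis=0), 1.0)
```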
I noticed that in ca1c5c4 you adjusted the accuracy thresholds to pass the tests. I'm not against this, but I'd like to better understand the cause. Should we be concerned by this drop in quality? Or is it mostly caused by a lower spread in the predictions because of the new cascade decomposition?
> I noticed that in ca1c5c4 you adjusted the accuracy thresholds to pass the tests. I'm not against this, but I'd like to better understand the cause. Should we be concerned by this drop in quality? Or is it mostly caused by a lower spread in the predictions because of the new cascade decomposition?
Apart from the three cases where I modified the thresholds, the CRPS was in fact slightly better with the new filter configuration. In any case, we need to do a careful evaluation of any changes in the forecast skill before merging this pull request.
I did some verification experiments with the old and new bandpass filters. Here are the results with the FMI data in the repository (nowcast start times between 16:00 and 16:30 UTC on 28 September 2016). The models were run with their default configurations (STEPS with velocity perturbations disabled).
Deterministic metrics for S-PROG nowcasts:
CRPS for STEPS nowcasts:
It seems that the new filters give a slightly better forecast skill. However, as noted above, the skill with the new filters can also be slightly lower in some cases. Should we merge this pull request?
This looks great, thanks, Seppo, for the analysis! Yes, all good on my side, you can merge into master. Great work!
It was pointed out in Issue #276 that when using the cascade decomposition with the default parameters, the ratio between successive scales is not constant: there was a gap between the first scale and the remaining ones. However, this can be fixed simply by setting
l_0 = (0.5 * l) ** (1 / (n - 1))
So, I made this the new default in cascade.bandpass_filters.filter_gaussian.
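For the running 512 x 512 example with 6 cascade levels, this default evaluates as follows (a quick numeric check, not pysteps code):

```python
l, n = 512, 6
l_0 = (0.5 * l) ** (1 / (n - 1))
print(l_0)             # ~3.03: the constant ratio between successive scales
print(l_0 ** (n - 1))  # 256.0 = 0.5 * l, spanning the full wavenumber range
```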