pySTEPS / pysteps

Python framework for short-term ensemble prediction systems.
https://pysteps.github.io/
BSD 3-Clause "New" or "Revised" License
466 stars 168 forks source link

Steps blending v1 #255

Closed dnerini closed 2 years ago

dnerini commented 2 years ago

This PR needs to implement the changes needed to make the blending module compatible with v1 (i.e. without xarray).

dnerini commented 2 years ago

In order to merge into master v1 there are few pending issues that we need to solve:

1. Remove xarray hard dependency

I already started addressing this with b947b6b but some work remains. In particular, the reprojection module is based on xarray and needs to be migrated to numpy. Alternatively, we could make xarray an optional dependency. I also suggest to remove the bom importers based on xarray. @ladc @RubenImhoff @cvelascof

2. NWP Importers

NWP importers should be separated from the base package as plugins, see #218. This could also a way to address the issue above, since the xarray dependency could be moved to the external plugin. For now, we suggest we do the exercise with the KNMI NWP importer and use that one for testing and examples of blending in the base package. What do you think @RubenImhoff?

RubenImhoff commented 2 years ago

I'll have a look at it (in the coming days) too!

codecov[bot] commented 2 years ago

Codecov Report

Merging #255 (3a2cac0) into master (72f3bf5) will increase coverage by 1.29%. The diff coverage is 90.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #255      +/-   ##
==========================================
+ Coverage   80.92%   82.22%   +1.29%     
==========================================
  Files         142      158      +16     
  Lines       10758    12088    +1330     
==========================================
+ Hits         8706     9939    +1233     
- Misses       2052     2149      +97     
Flag Coverage Δ
unit_tests 82.22% <90.15%> (+1.29%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pysteps/decorators.py 99.14% <ø> (ø)
pysteps/tests/helpers.py 94.28% <ø> (ø)
pysteps/tests/test_datasets.py 72.09% <ø> (ø)
pysteps/tests/test_nowcasts_steps.py 100.00% <ø> (ø)
pysteps/tests/test_plugins_support.py 100.00% <ø> (ø)
pysteps/tests/test_verification_detcatscores.py 100.00% <ø> (ø)
pysteps/tests/test_interfaces.py 96.62% <28.57%> (-3.38%) :arrow_down:
pysteps/blending/interface.py 50.00% <50.00%> (ø)
pysteps/blending/linear_blending.py 80.76% <80.76%> (ø)
pysteps/blending/steps.py 83.89% <83.89%> (ø)
... and 36 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 72f3bf5...3a2cac0. Read the comment docs.

RubenImhoff commented 2 years ago

@ladc, I see that there is an examples/test_decomposition_NWP.py. Do we want to keep this example? It could be useful, I think, but perhaps we should add some more doc strings and printed output in it.

RubenImhoff commented 2 years ago

@dnerini, how should we move forward with the external nwp importer plugins? Do you want to use the cookiecutter implementation or make a separate plugin that basically uses the current nwp_importers.py implementation?

RubenImhoff commented 2 years ago

I changed the reprojection script to work without xarray. It (for now) assumes that nwp_importers returns an array with the rainfall fields and a separate metadata dictionary (similar to io.importers.py). So the reprojection tests should pass if these importers return that.

dnerini commented 2 years ago

hould we move forward with the external nwp importer plugins? Do you want to use the cookiecutter implementation or make a separate plugin that basically uses the current nwp_importers.py implementation?

yes I think we should give the cookicutter plugin from @aperezhortal a try to implement the nwp importers! No idea if this will work ... can we start with the KNMI importer?

RubenImhoff commented 2 years ago

The cookiecutter is a nice tool, @aperezhortal! It took some time to get familiar with it, but I have all three NWP importers in a separate plugin now. A couple of questions to move forward (and let you review it):

aperezhortal commented 2 years ago

Hi @RubenImhoff, you wrote the first-ever known plugin :muscle:

About your questions:

Should I upload the plugin as a separate repository in pySTEPS? Then you could review the PR there.

That is good idea. You can perhaps upload it to an "initial_commit" branch and then submit a PR to the main branch with an initial commit that is just a README. I will be happy to take a look at it.

How to link the plugin to pysteps (locally it is easy with the installer, but does this also directly work in a repository)?

Do you mean to link the plugin to pysteps repository's workflow to load the NWP test data? If so, I can think of two ways of doing that. One is to add a new step to the CI workflow that installs the plugins after pysteps is installed. This seems like the simpler approach to me. For free, we are also testing the integration of this plugin.
The other option is to avoid any explicit dependency on the plugin and create fixtures for tests in formats that are easy to read (like numpy of h5). The advantage here is that the CI workflow does not depend on the plugin to run the test, which is also good since the latter is not an explicit dependency of the blending module. This will perhaps help to keep things simple. What do you think ?

For the tests in the plugin, how to link the pysteps-data-master to it (similar to how that is done in pysteps)?

If only a few files are needed, one option is to create a small script that downloads those files before running the tests.

RubenImhoff commented 2 years ago

Thanks for your quick response, @aperezhortal! I'll make a PR of the plugin - then we can also discuss further there.

You can perhaps upload it to an "initial_commit" branch and then submit a PR to the main branch with an initial commit that is just a README.

What do you mean with the initial commit just containing a README? You mean just putting the README in there without the rest of the files?

Do you mean to link the plugin to pysteps repository's workflow to load the NWP test data? If so, I can think of two ways of doing that. One is to add a new step to the CI workflow that installs the plugins after pysteps is installed. This seems like the simpler approach to me. For free, we are also testing the integration of this plugin. The other option is to avoid any explicit dependency on the plugin and create fixtures for tests in formats that are easy to read (like numpy of h5). The advantage here is that the CI workflow does not depend on the plugin to run the test, which is also good since the latter is not an explicit dependency of the blending module. This will perhaps help to keep things simple. What do you think ?

The first option is easy with the examples script, but I believe no other tests (except for the previous nwp_importers) depend on the presence of the importers. As the examples gallery is being decoupled from the main repository anyway, the latter option might work as well (though the examples gallery repo requires the plugin then..).

If only a few files are needed, one option is to create a small script that downloads those files before running the tests.

Good idea, let's put it in a PR and continue from there.

aperezhortal commented 2 years ago

What do you mean with the initial commit just containing a README? You mean just putting the README in there without the rest of the files?

Yes. I mentioned that to make sure that the destination branch of the PR exists and leave it empty to avoid having work-in-progress code in the main branch.

RubenImhoff commented 2 years ago

16 reported issues from the Codacy report left. Most are very minor, but not sure how to fix those.

edit: down to 14 :)

dnerini commented 2 years ago

Excellent work @RubenImhoff !! I think we are very close to merging: tests are passing and the coverage delta is positive. Probably we can just ignore the remaining issues from codacy, I'll have a look.

Let give this a final look and see if everything is ready.

RubenImhoff commented 2 years ago

Can I ask you to have a final look on the prob matching and masking implementation? Or did you already do that?

dnerini commented 2 years ago

@RubenImhoff can you briefly explain how you're dealing with the external dependencies for the nwp importers? I assume we still need them for the examples and the tests?

RubenImhoff commented 2 years ago

can you briefly explain how you're dealing with the external dependencies for the nwp importers? I assume we still need them for the examples and the tests?

We don't need them anymore for the tests, luckily. I hard coded some metadata and made dummy data instead of importing the NWP data. The examples is a good point that I forgot to mention. For the blended_forecast (I believe that is the only one - I'll check), it still depends on the NWP importers. Since we are moving the examples to a separata repo (right?), we should think about how to incorporate that.

dnerini commented 2 years ago

can you briefly explain how you're dealing with the external dependencies for the nwp importers? I assume we still need them for the examples and the tests?

We don't need them anymore for the tests, luckily. I hard coded some metadata and made dummy data instead of importing the NWP data. The examples is a good point that I forgot to mention. For the blended_forecast (I believe that is the only one - I'll check), it still depends on the NWP importers. Since we are moving the examples to a separata repo (right?), we should think about how to incorporate that.

Ok, very good, thanks! I like the approach with the dummy input data, makes testing much slimmer, well done!

I don't see problems for having an additional external dependency for the examples. We just need to explain it at the beginning of the tutorial and add the importers plugin to the list of doc requirements once it is available on pypi.

RubenImhoff commented 2 years ago

I don't see problems for having an additional external dependency for the examples. We just need to explain it at the beginning of the tutorial and add the importers plugin to the list of doc requirements once it is available on pypi.

Agree, that would work! Plan B would be to work with dummy data in the example too, but that wouldn't look as nice.

dnerini commented 2 years ago

I don't see problems for having an additional external dependency for the examples. We just need to explain it at the beginning of the tutorial and add the importers plugin to the list of doc requirements once it is available on pypi.

Agree, that would work! Plan B would be to work with dummy data in the example too, but that wouldn't look as nice.

done in 146c6d5, have a look and change as you please ;-)

edit: I've activated the relative RTD built, but it'll succeed only when the pysteps-nwp-importers package is available on pypi

dnerini commented 2 years ago

@RubenImhoff @ladc currently the test_decomposition_NWP.py script in the examples folder cannot be rendered in the example gallery because is missing the necessary formatting (mostly the docstrings at the top of the file).

Do you want to include it or should we simply remove it from the folder?

dnerini commented 2 years ago

Also, @RubenImhoff I'm getting a ValueError: R_models does not contain sufficient lead times for this forecast from the blending example when building the documentation:

Traceback (most recent call last):
  File "/Users/daniele/Dropbox/Repos/pysteps/examples/blended_forecast.py", line 154, in <module>
    precip_forecast = blending.steps.forecast(
  File "/Users/daniele/Dropbox/Repos/pysteps/pysteps/blending/steps.py", line 349, in forecast
    _check_inputs(R, R_d_models, V, V_models, timesteps, ar_order)
  File "/Users/daniele/Dropbox/Repos/pysteps/pysteps/blending/steps.py", line 1529, in _check_inputs
    raise ValueError(
ValueError: R_models does not contain sufficient lead times for this forecast

Any idea what the cause could be?

RubenImhoff commented 2 years ago

Do you want to include it or should we simply remove it from the folder?

Removing it from the folder is good with me!

RubenImhoff commented 2 years ago

Also, @RubenImhoff I'm getting a ValueError: R_models does not contain sufficient lead times for this forecast from the blending example when building the documentation:

Traceback (most recent call last):
  File "/Users/daniele/Dropbox/Repos/pysteps/examples/blended_forecast.py", line 154, in <module>
    precip_forecast = blending.steps.forecast(
  File "/Users/daniele/Dropbox/Repos/pysteps/pysteps/blending/steps.py", line 349, in forecast
    _check_inputs(R, R_d_models, V, V_models, timesteps, ar_order)
  File "/Users/daniele/Dropbox/Repos/pysteps/pysteps/blending/steps.py", line 1529, in _check_inputs
    raise ValueError(
ValueError: R_models does not contain sufficient lead times for this forecast

Any idea what the cause could be?

I'll have a look at it.

RubenImhoff commented 2 years ago

I think I found the problem: the NWP forecast was not correctly indexed (the end index was missing). I added it, but couldn't test it myself, due to still the same netcdf problem when reading the bom data (still no clue why and why only with the bom data..). Could you maybe test if the example runs locally?

Anyway, I also changed the file path and name towards the NWP forecast, because the NWP names are no longer in pystepsrc. It's a hard coded path now, which is probably not ideal. How should we read this with pysteps?

RubenImhoff commented 2 years ago

The Black test seems to get stuck at the jupyter notebook example, should we exclude it from the test somewhere?

dnerini commented 2 years ago

I think I found the problem: the NWP forecast was not correctly indexed (the end index was missing). I added it, but couldn't test it myself, due to still the same netcdf problem when reading the bom data (still no clue why and why only with the bom data..). Could you maybe test if the example runs locally?

It worked for me!

Anyway, I also changed the file path and name towards the NWP forecast, because the NWP names are no longer in pystepsrc. It's a hard coded path now, which is probably not ideal. How should we read this with pysteps?

Right, good point. I need to think more about this... (also the doc built is failing because the hardcoded path is not correct)

edit: the more I look at it, the more I think the "data_sources" section of the pystepsrc should be hosted in the pysteps-data repo...

The Black test seems to get stuck at the jupyter notebook example, should we exclude it from the test somewhere?

Turned out it was something else causing it to fail (see 5cd752b ).

RubenImhoff commented 2 years ago

Got the blended_forecast example also working locally again - everything seems to work fine now. Looks like we're getting there! What steps are left, just a final review?

dnerini commented 2 years ago

Got the blended_forecast example also working locally again - everything seems to work fine now. Looks like we're getting there! What steps are left, just a final review?

There is a problem with the RTD builds that I need to investigate. It doesn't appear to be linked to this branch, but it would be anyway good to check how the new documentation looks like before merging.

Also maybe we can work a bit the linear blending example? As it is now it feels a bit too little verbose and too much code to be a tutorial imho. Also, maybe we can replace the dummy nwp with actual data?

Otherwise yes, we could do a final round of review, trying to involve a larger audience if possible? @ladc @cvelascof @loforest @aitaten

RubenImhoff commented 2 years ago

I can have a look at the linear blending example - I'll make it a little more comparable to the blended_forecast example

dnerini commented 2 years ago

doc is building, here: https://pysteps.readthedocs.io/en/blending-v1

RubenImhoff commented 2 years ago

Hmm codacy was happy with me this time ;)

dnerini commented 2 years ago

Personally I think this is ready to be merged!

RubenImhoff commented 2 years ago

Nice work with the variable names, @dnerini! That was was still somewhere on my to-do list, so great it's done. :)

RubenImhoff commented 2 years ago

In the original STEPS paper and technical note (in particular, the BPS2004 technical note), the lag-1 and lag-2 parameters are allowed to regress to their climatological values (eq. 12 - 19). By doing so, the Phi parameters change over time, which enhances the AR process. I have not implemented this (yet) in the steps blending module and saw that it also isn't part of the pysteps steps nowcasting module (right?). Is it worth implementing this, or is it left out of pysteps for a reason?

Other than that, this PR is good to go! :)

dnerini commented 2 years ago

In the original STEPS paper and technical note (in particular, the BPS2004 technical note), the lag-1 and lag-2 parameters are allowed to regress to their climatological values (eq. 12 - 19). By doing so, the Phi parameters change over time, which enhances the AR process. I have not implemented this (yet) in the steps blending module and saw that it also isn't part of the pysteps steps nowcasting module (right?). Is it worth implementing this, or is it left out of pysteps for a reason?

Other than that, this PR is good to go! :)

good that you mention it (and perhaps it could be included in the notes section of the method itself), but I'd leave this as "future development" ;-)

RubenImhoff commented 2 years ago

As discussed with @dnerini, the PR is ready to be merged. Any further developments can go through separate PRs.