pySTEPS / pysteps

Python framework for short-term ensemble prediction systems.
https://pysteps.github.io/
BSD 3-Clause "New" or "Revised" License
441 stars 160 forks source link

Bugfix reproducibility & ensemble member order with dask #347

Closed mpvginde closed 5 months ago

mpvginde commented 6 months ago

This PR (hopefully) fixes 2 issues:

337: Strange artifacts showing up in the STEPS nowcasting due to the random order of the ensemble members in the numpy array when using dask multi-threading

346: No reproducibility of the (blended) STEPS nowcast results even when a fixed seed argument is given

codecov[bot] commented 6 months ago

Codecov Report

Attention: Patch coverage is 81.81818% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 83.32%. Comparing base (be8eea4) to head (a29aa05). Report is 1 commits behind head on master.

Files Patch % Lines
pysteps/noise/utils.py 76.92% 3 Missing :warning:
pysteps/io/importers.py 0.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #347 +/- ## ========================================== - Coverage 83.34% 83.32% -0.02% ========================================== Files 161 161 Lines 12364 12363 -1 ========================================== - Hits 10305 10302 -3 - Misses 2059 2061 +2 ``` | [Flag](https://app.codecov.io/gh/pySTEPS/pysteps/pull/347/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pySTEPS) | Coverage Δ | | |---|---|---| | [unit_tests](https://app.codecov.io/gh/pySTEPS/pysteps/pull/347/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pySTEPS) | `83.32% <81.81%> (-0.02%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pySTEPS#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dnerini commented 6 months ago

Nice job @mpvginde! This looks fantastic and I'm curious to hear from @RubenImhoff about #337.

I was wondering, should we add an explicit test on reproducibility?

RubenImhoff commented 6 months ago

Great work, @mpvginde! I will have a look at the PR later this week. :)

dnerini commented 5 months ago

Hi @mpvginde should we try to push your fixes quickly to master and release them? Do you need any support of the issues that are still open or you think this is now ready to be merged?

mpvginde commented 5 months ago

Hi @dnerini, I noticed last week that in some cases, the result was also dependent on the number of dask workers used. This should be fixed now with the last commit that I added. I think this can be merged now. I'm not sure why the Black check is failing though, If somebody could help me with that.

I did notice that the method used for generating the random numbers is using some deprecated numpy-functions. If I find some more time I will try to update the generation of random number to the latest best practices. But for now this fix should be ok.

dnerini commented 5 months ago

I think this can be merged now.

Brilliant! Just wondering if we should include the fix for the linda method too, as mentioned by @RubenImhoff?

I'm not sure why the Black check is failing though, If somebody could help me with that.

maybe you're using a older version of black locally? during tests we are using the latest available version

dnerini commented 5 months ago

alright, I'm merging this, thanks a lot @mpvginde for the nice contribution!