Bootstrapping persistence and skill

pangeo-data / climpred

:earth_americas: Verification of weather and climate forecasts :earth_africa:

https://climpred.readthedocs.io

MIT License


Bootstrapping persistence and skill #60

Closed aaronspring closed 5 years ago

aaronspring commented 5 years ago

My version of https://github.com/bradyrx/climpred/issues/46

So far I have only bootstrapped the skill threshold from an uninitialized ensemble. My persistence forecast and the signal were without any uncertainty.

I will implement a bootstrapping on those.

Expected result: http://hdl.handle.net/21.11116/0000-0002-0A63-4 Fig. 4.7, but also applicable to maps.

aaronspring commented 5 years ago

I will propose a different way to calculate the persistence forecast: I will calculate it only for the initialisations used in the prediction ensemble. For your DPLE/LENS case there will be no difference, as you take all inits, but I only take 12 samples.
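A minimal sketch of this idea, assuming that persistence skill is measured as the correlation between the control value at each init date and the control value `lead` steps later. The function name `persistence_skill`, the synthetic AR(1) control run, and the 12 init positions are hypothetical and only illustrate restricting the calculation to the ensemble's inits; this is not climpred code.

```python
import numpy as np


def persistence_skill(control, init_idx, max_lead):
    """Correlate the control at each init with the control `lead` steps later.

    control  : 1-D array, control-run time series (hypothetical input)
    init_idx : integer positions of the initialisation dates in `control`
    max_lead : number of lead steps to evaluate
    """
    control = np.asarray(control, dtype=float)
    init_idx = np.asarray(init_idx)
    skill = np.empty(max_lead)
    for lead in range(1, max_lead + 1):
        # keep only inits whose verification date is still inside the control run
        valid = init_idx[init_idx + lead < control.size]
        skill[lead - 1] = np.corrcoef(control[valid], control[valid + lead])[0, 1]
    return skill


# Synthetic AR(1) control run, for illustration only
rng = np.random.default_rng(42)
control = np.zeros(500)
for t in range(1, control.size):
    control[t] = 0.8 * control[t - 1] + rng.normal()

all_inits = np.arange(control.size)   # every date used as an init
few_inits = np.arange(100, 340, 20)   # 12 init dates, as in the ensemble

skill_all = persistence_skill(control, all_inits, max_lead=20)
skill_few = persistence_skill(control, few_inits, max_lead=20)
```

With only 12 inits, `skill_few` is much noisier than `skill_all`, which is the reason for putting a bootstrapped uncertainty on it.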

I bootstrap with replacement as in Goddard et al. 2013:

[Figure: persistence_bootstrap]

Before is the persistence forecast over all init dates from the control; the colors are for the 12 inits only.

Bootstrapping with replacement will give me confidence intervals (CI) for persistence, initialised and uninitialized skill. I reproduced a skill figure from Li et al. 2016 (see the explanation in her paper, Fig. 3a-c), a study in which a statistician was also involved.

[Figure: skill_bootstrap]
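As a rough sketch of how bootstrapping with replacement yields confidence intervals, the snippet below resamples the init dimension and takes quantiles of the resampled skill. It assumes skill is a Pearson correlation over inits and that the data are plain arrays of shape (init, lead); the name `bootstrap_skill_ci` and its parameters are hypothetical, not the functions proposed in this issue.

```python
import numpy as np


def bootstrap_skill_ci(forecast, truth, n_boot=1000, ci=(0.025, 0.975), seed=0):
    """Return (skill, lower, upper) per lead from resampling inits with replacement.

    forecast, truth : arrays of shape (n_init, n_lead)
    """
    rng = np.random.default_rng(seed)
    forecast = np.asarray(forecast, dtype=float)
    truth = np.asarray(truth, dtype=float)
    n_init, n_lead = forecast.shape

    def corr(f, t):
        # Pearson correlation over the init axis, one value per lead
        f = f - f.mean(axis=0)
        t = t - t.mean(axis=0)
        return (f * t).sum(axis=0) / np.sqrt((f**2).sum(axis=0) * (t**2).sum(axis=0))

    skill = corr(forecast, truth)
    boot = np.empty((n_boot, n_lead))
    for b in range(n_boot):
        # draw n_init inits with replacement and recompute the skill
        idx = rng.integers(0, n_init, size=n_init)
        boot[b] = corr(forecast[idx], truth[idx])
    # nanquantile ignores the (very unlikely) degenerate resamples with zero variance
    lower, upper = np.nanquantile(boot, ci, axis=0)
    return skill, lower, upper
```

The same routine can be applied to the initialized, uninitialized and persistence forecasts in turn, so that each skill curve gets its own interval.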

What do you think about the approach @bradyrx ? With this, all skills will also have confidence intervals based on bootstrapping/Monte Carlo methods.

I will implement the functions tomorrow, but wanted to hear your opinion.

aaronspring commented 5 years ago

To me this seems to give the same kind of results as in https://www.earth-syst-dynam.net/10/45/2019/esd-10-45-2019-discussion.html, except that the p-values come from bootstrapping rather than from z-scores.

Refs:

Li, Hongmei, Tatiana Ilyina, Wolfgang A. Müller, and Frank Sienz. “Decadal Predictions of the North Atlantic CO2 Uptake.” Nature Communications 7 (March 30, 2016): 11076. https://doi.org/10/f8wkrs.
Efron, Bradley, and R. J. Tibshirani. An Introduction to the Bootstrap. 1st edition. New York: Chapman and Hall/CRC, 1993.
Goddard, L., A. Kumar, A. Solomon, D. Smith, G. Boer, P. Gonzalez, V. Kharin, et al. “A Verification Framework for Interannual-to-Decadal Predictions Experiments.” Climate Dynamics 40, no. 1–2 (January 1, 2013): 245–72. https://doi.org/10/f4jjvf.

A nice test would be to check whether the results of both approaches are reasonably close.
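One way such a comparison could look, as a sketch only: for a single correlation, compute a one-sided p-value from a z-score and from a bootstrap, then check that they agree. The z-score here uses the Fisher transform, which is an assumption on my side and not necessarily the method of the linked paper; both function names are hypothetical.

```python
import math

import numpy as np


def fisher_z_pvalue(x, y):
    """One-sided p-value for H0: correlation <= 0, via the Fisher z-transform."""
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    z = math.atanh(r) * math.sqrt(n - 3)
    # upper tail of the standard normal distribution, P(Z >= z)
    return 0.5 * math.erfc(z / math.sqrt(2))


def bootstrap_pvalue(x, y, n_boot=2000, seed=0):
    """Fraction of resamples (with replacement) whose correlation is <= 0."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if np.corrcoef(x[idx], y[idx])[0, 1] <= 0:
            count += 1
    return count / n_boot
```

Note that the Fisher transform assumes independent samples; for autocorrelated climate series the sample size would have to be replaced by an effective one, which is one reason the two approaches may differ.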

bradyrx commented 5 years ago

Will hopefully get a chance to review this today. If not today, will review tomorrow.

aaronspring commented 5 years ago

I get this kind of result for every other forecast (uninitialized or persistence) compared with initialized: skill, 2 CI levels and a p-value, all over 20 time steps (leads).

To simplify the output, I propose to merge all of these into one xr.Dataset.

<xarray.Dataset>
Dimensions:  (results: 4, time: 20)
Coordinates:
  * results  (results) object 'skill' 0.025 0.975 'p'
  * time     (time) int64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Data variables:
    tos      (results, time) float64 0.8662 0.4955 0.3368 ... 0.08 0.26 0.21
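A sketch of how a Dataset with this layout could be assembled, assuming the skill, CI bounds and p-values are already available as 1-D arrays over lead time. The numbers below are random placeholders, not the results shown above, and only the variable name `tos` and the dimension names are taken from the printed output.

```python
import numpy as np
import xarray as xr

lead = np.arange(1, 21)

# Placeholder values for illustration only
rng = np.random.default_rng(0)
skill = rng.uniform(0.0, 1.0, lead.size)
lower = skill - 0.1
upper = skill + 0.1
pval = rng.uniform(0.0, 1.0, lead.size)

# Mixed string/float labels need an object-dtype coordinate
results = np.array(["skill", 0.025, 0.975, "p"], dtype=object)

ds = xr.Dataset(
    {"tos": (("results", "time"), np.stack([skill, lower, upper, pval]))},
    coords={"results": results, "time": lead},
)

# Individual entries are selected by label
p_values = ds["tos"].sel(results="p")
ci_lower = ds["tos"].sel(results=0.025)
```

Mixing strings and floats in one coordinate works, but it forces an object dtype; separate data variables or all-string labels would be alternatives if the naming is still open.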

I am not yet happy with the wording though. Do you have a better idea? See also the notebook.

aaronspring commented 5 years ago

Moving the discussion to https://github.com/bradyrx/climpred/pull/78

bradyrx commented 5 years ago

Closing this since it is being addressed in #78