bradyrx commented 5 years ago

Description

This PR allows a single time series to be correlated with a grid via xr.broadcast. It also changes the lag keyword to lead for more clarity and allows negative values.

E.g., lead=3 has x lead y by 3 time steps. lead=-3 has x lag y by 3 time steps.

Fixes # https://github.com/bradyrx/esm_analysis/issues/29

Type of change

Please delete options that are not relevant.

[X] Bug fix (non-breaking change which fixes an issue)
[X] New feature (non-breaking change which adds functionality)

How Has This Been Tested?

I've tested with some CESM output to compare a gridded output to a single climate index time series.

Checklist (while developing)

[X ] I have added docstrings to all new functions.
[X] I have commented my code, particularly in hard-to-understand areas
[x] Tests added for pytest, if necessary.
[X] I have updated the sphinx documentation, if necessary.

Pre-Merge Checklist (final steps)

[ ] I have rebased onto master or develop (wherever I am merging) and dealt with any conflicts.
[ ] I have squashed commits to a reasonable amount, and force-pushed the squashed commits.
[ ] I have run make html on the documents to make sure example notebooks still compile.

bradyrx commented 5 years ago

@aaronspring should this be changed at the highest level, i.e. in climpred? My impression on climpred stats is that only functions used internally should be there. We do some correlations there, but do we ever need to correlate a time series to a grid currently? Maybe I should change the lag/lead language there and allow negative values.

Also, I'll make sure to add a test before merging this. It's probably good to always add testing on PRs now to start chipping away at that.

aaronspring commented 5 years ago

Generally I think we only need to implement stats functions here for esm_a or for climpred. Having both has no added value. rm_trend, autocorr, ...

I like the new clarity for lead here. In climpred we don’t specifically need this scalar to grid broadcasting.

bradyrx commented 5 years ago

Generally I think we only need to implement stats functions here for esm_a or for climpred. Having both has no added value. rm_trend, autocorr, ...

The reason behind having them here is so that you don't have to have climpred installed to get access to common functions, so I just wrap them here. There's some students in my lab using the package now and I don't want them to have to install climpred and follow its API to get access to autocorr and stuff like that.

I know it's redundant, but I want this to just be a kitchen sink of useful functions. Thanks for looking over it!

bradyrx commented 5 years ago

@aaronspring , can you check my test_stats specifically? I don't fully understand when to use fixtures. I was going to use fixtures for the data generation functions as in climpred, but I needed to call one of them twice which it doesn't seem you can do for fixtures.

Also, a thought. We could have all of the stats here and have climpred just import them from esm_analysis to avoid redundancy. And have climpred handle only what it needs for prediction. But I don't want to force people to install our kitchen sink package too.

aaronspring commented 5 years ago

And have climpred handle only what it needs for prediction. But I don't want to force people to install our kitchen sink package too.

rm_trend is not directly needed but convenient. I propose to kick all stats functions which are not used in notebooks and by any prediction functions

aaronspring commented 5 years ago

When you follow my suggestion you will get two times the same ds. Don’t know how to get two different.. maybe split the ds

bradyrx commented 5 years ago

@aaronspring , see my updates for the implementation. You can write a wrapper inside the fixture so it can be called multiple times. Will merge after tests run.

bradyrx / esmtools

Update correlation to handle single time series #58

Description

Type of change

How Has This Been Tested?

Checklist (while developing)

Pre-Merge Checklist (final steps)