Closed raybellwaves closed 4 years ago
do you want to apply crps on a dim other than member
?
fct
to one truth obs
.I agree I think we should always apply over member
.
In my example above to return shape lon
it would be over lat
and member
I think I want to stack similar to https://github.com/raybellwaves/xskillscore/blob/46fa8528741a0dcc6caf4e55a6079c3120766711/xskillscore/core/deterministic.py#L82 ... will come back to this
Braining storming: I'm just curious if in my example above (if that is correct)
xs.crps_ensemble(obs3.isel(lat=0), fct3.isel(lat=0))
could be written under the hood similar to other functions
def crps_ensemble(a, b, dim):
# Raise Error if member not in dim...
# reshape data to match user specified dim
a = a.isel(dim=0)
b = b.isel(dim=0)
# continue with existing code...
xs.crps_ensemble(obs3, fct3, dim=['lat', 'member'])
then the new method: xs.crps_ensemble(obs3, fct3, 'member')
should be same as old method xs.crps_ensemble(obs3, fct3)
I dont understand what you mean by lon shape
I dont understand why xs.crps_ensemble(obs3, fct3, dim=['lat', 'member'])
I dont understand what you mean by lon shape
Apply crps over lat and member.
I dont understand why
xs.crps_ensemble(obs3, fct3, dim=['lat', 'member'])
Similar to other metrics e.g. for apply pearson_r over lat and lon xs.pearson_r(obs3, fct3.isel(member=0), dim=['lat', 'lon'])
Goes back to the discussion we had of implementing multi-dim for deterministic: why?
Guess you could do crps over Nino3.4 region = xs.crps_ensemble(obs3, fct3, dim=['lat', 'lon', 'member'])
?
You can average over lon lat after apply to member. I think crps is only well defined over member.
You can average over lon lat after apply to member. I think crps is only well defined over member.
Probably best suggestion. Thanks
But probably averaging before and then calculating skill is also an option
But probably averaging before and then calculating skill is also an option
Was just going that ask. You got a preference? I feel average before is probably more interpretable
import xarray as xr
import numpy as np
import pandas as pd
import xskillscore as xs
o = xr.DataArray(
np.random.rand(3, 4, 5),
coords=[
pd.date_range("1/1/2000", "1/3/2000", freq="D"),
np.arange(4),
np.arange(5),
],
dims=["time", "lat", "lon"],
)
f = xr.DataArray(
np.random.rand(3, 4, 5, 6),
coords=[
pd.date_range("1/1/2000", "1/3/2000", freq="D"),
np.arange(4),
np.arange(5),
np.arange(6)
],
dims=["time", "lat", "lon", "member"],
)
o_avg = o.mean(dim=['lat', 'lon'])
f_avg = f.mean(dim=['lat', 'lon'])
xs.crps_ensemble(o_avg, f_avg, dim='member')
crps = xs.crps_ensemble(o, f, dim='member')
crps.mean(dim=['lat', 'lon'])
values are on order of magnitude larger in crps.mean(dim=['lat', 'lon'])
>>> xs.crps_ensemble(o_avg, f_avg, dim='member')
<xarray.DataArray (time: 3)>
array([0.00819853, 0.04241865, 0.03740562])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03
>>> crps = xs.crps_ensemble(o, f, dim='member')
>>>
>>> crps.mean(dim=['lat', 'lon'])
<xarray.DataArray (time: 3)>
array([0.16495624, 0.18920637, 0.1971401 ])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03
>>>
I would say it depends on what you want to check:
I wouldnt use the dim args in crps though:
crps = xs.crps_ensemble(o, f, dim='member')
This implies that you can calculate crps_ensemble
also over dim other than member which is not defined. I would also suggest to take this out of the README example.
@aaronspring I agree. welcome to to make a PR
The member dimension is a "special" dimension in the calculation of probabilistic metrics. Often "applying over multiple dims" for these metrics simply means "averaging metric over multiple dims" (for example, this is the case for crps as @aaronspring mentions above).
Given this, is it worth distinguishing between a "member_dim
" and "dim
" input to the probabilistic functions?
probably yes.
either we require member
as the member dim (as currently) or have an explicit keyword for this in every function member_dim='member'
.
So we could add dim to the args in crps and pass all dim item expect member into a mean function?
in the read me we don't specify the dim and it returns the shame same as obs
In other metrics you can specify the dim as
To get crps of shape lon do I just specify what goes into the metric?
I know dim in this metric refers to the ensemble dimension
dim='member'
https://github.com/raybellwaves/xskillscore/blob/master/xskillscore/core/probabilistic.py#L92I know there has been discussion on this before https://github.com/TheClimateCorporation/properscoring/issues/6