Ouranosinc / xclim

Library of derived climate variables, i.e. climate indicators, based on xarray.
https://xclim.readthedocs.io/en/stable/
Apache License 2.0

In subset_gridpoint, if year start is after end of dataset, to_netcdf() fails without an informative message #190

Closed huard closed 5 years ago

huard commented 5 years ago

Maybe this should be taken to xarray directly.

tlogan2000 commented 5 years ago

This should be a pretty simple check to add on our end. We can throw an error if there is no temporal overlap. I would think that the same issue occurs for subset_bbox as well

huard commented 5 years ago

A check such as

if da.time.sel(time=start_date).min() > da.time.sel(time=end_date).max():

fails if end_date is not in the selection.
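
One way to avoid depending on both dates being present in the time axis is to compare the requested bounds with the first and last time steps directly. The following is only a minimal sketch in plain Python: `has_temporal_overlap` is a hypothetical helper (not part of xclim), and a list of `datetime` objects stands in for `da.time`.

```python
from datetime import datetime


def has_temporal_overlap(times, start_date, end_date):
    """Return True if [start_date, end_date] intersects the span of `times`.

    Hypothetical helper for illustration; `times` is a non-empty
    sequence of datetime objects standing in for the time coordinate.
    """
    first, last = min(times), max(times)
    # Two closed intervals overlap when each one starts before the other ends.
    return start_date <= last and end_date >= first


# Illustrative time axis: one step per year, 1950 through 2000.
times = [datetime(year, 1, 1) for year in range(1950, 2001)]
```

With this axis, a request for 1990 to 2050 overlaps the data even though 2050 is not in it, while a request for 2010 to 2050 does not, which is the case where an informative error could be raised.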

Zeitsperre commented 5 years ago

Looking into this as well. I think it's worse than we think:

In [1]: import xarray as xr

In [2]: from xclim import subset

In [3]: a = "sftlf_fx_MRI-CGCM3_rcp85_r0i0p0.nc"

In [4]: ds = xr.open_dataset(a)

In [5]: f = subset.subset_gridpoint(ds, lon=-75., lat=45.)

In [6]: f
Out[6]:
<xarray.Dataset>
Dimensions:   (bnds: 2)
Coordinates:
    lat       float64 45.42
    lon       float64 0.0
Dimensions without coordinates: bnds
Data variables:
    lat_bnds  (bnds) float64 ...
    lon_bnds  (bnds) float64 ...
    sftlf     float32 ...

For the two test files, the longitudes are in 0-360 degrees, so conversion is necessary. The algorithm in xclim master should be able to handle this, but as you can see, the longitude in the subset gridpoint is 0.0. Not only is there no error message, but the subset object otherwise looks perfectly fine. I'll see what happens when this is run with the xclim version currently on finch (v0.8-beta).

huard commented 5 years ago

We have to update finch anyway, so let's just fix xclim and update finch.

Zeitsperre commented 5 years ago

Just tried the same approach with xclim @ v0.8-beta: Exact same results and no error raised. Does this merit a new issue?

tlogan2000 commented 5 years ago

If I am following correctly, we need to address two issues? The first concerns temporal subsetting, where the start date and end date have to be within the dataset bounds?

The second concerns the adjustment of positive longitude values?

huard commented 5 years ago

That's my understanding. If the date is outside of bounds, should we raise an error, or think of it as a slice, so return data up until the end?

tlogan2000 commented 5 years ago

I would try to make it work as a slice, i.e. up until the max time in the ds. We should be able to check whether the end_date is beyond ds.time.max(). If it is, raise a warning and assign the max time to end_date...
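
The slice-like behaviour described here could look roughly like the sketch below. It is written in plain Python rather than against xarray, and `clamp_end_date` is a hypothetical name, not an existing xclim function; a list of `datetime` objects stands in for `ds.time`.

```python
import warnings
from datetime import datetime


def clamp_end_date(times, end_date):
    """Return end_date, clamped to the last available time step.

    Hypothetical helper for illustration; warns instead of raising
    when the requested end date lies beyond the data.
    """
    last = max(times)
    if end_date > last:
        warnings.warn(
            f"end_date {end_date:%Y-%m-%d} is after the last time step "
            f"{last:%Y-%m-%d}; using the last time step instead."
        )
        return last
    return end_date


# Illustrative time axis: one step per year, 1950 through 2000.
times = [datetime(year, 1, 1) for year in range(1950, 2001)]
```

A request ending in 2100 would then be treated as ending at the last time step, with a warning, while an end date inside the data is returned unchanged.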

tlogan2000 commented 5 years ago

I think I have found the issue in _check_lons() that keeps positive lons from working.

We currently check for np.all(args[0].lon > 0) # line 66 subset.py
or np.all(args[0].lon < 0) # line 71

However, your test dataset has lon values of exactly 0. I think simply changing these to >= 0 and <= 0 will fix the issue. This will be addressed in the fix for #271.
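
To illustrate why a single longitude of exactly 0 defeats the strict comparison, here is a small sketch in plain Python. The function names and the evenly spaced grid are assumptions made for illustration only; this is not the actual `_check_lons()` code.

```python
def all_strictly_positive(lons):
    """Strict test, analogous to np.all(lon > 0)."""
    return all(lon > 0 for lon in lons)


def all_non_negative(lons):
    """Relaxed test, analogous to np.all(lon >= 0)."""
    return all(lon >= 0 for lon in lons)


def to_0_360(lon):
    """Map a longitude in degrees east onto the 0-360 convention."""
    return lon % 360


# Illustrative 0-360 grid that starts at exactly 0.0 (spacing is assumed).
grid_lons = [i * 1.125 for i in range(320)]

# The strict test does not recognise this grid as 0-360 because of the 0.0,
# so a requested lon of -75 would never be converted before selection.
strict = all_strictly_positive(grid_lons)   # False
relaxed = all_non_negative(grid_lons)       # True
requested = to_0_360(-75.0)                 # 285.0
```

With the relaxed test the grid is detected as 0-360, and the requested -75 can be converted to 285 before the nearest grid point is selected.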

tlogan2000 commented 5 years ago

closed via #271