Closed rbavery closed 5 years ago
Thanks for the bug report @rbavery! Can you try to update this example given the suggestions I made in #9 and see if you can get any closer?
I appreciate your patience and willingness to work with us on this new package.
Thanks @rabernat, it works as expected. But should the time coordinates be dropped?
[88]
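import numpy as np
import xarray as xr
from xhistogram.xarray import histogram
# name the variable so the output is labeled 'histogram_reflectance'
t_series['reflectance'].name = 'reflectance'
# one set of bin edges shared by every time step
bin_arr = np.linspace(rmin, rmax, nbins)
# histogram over the spatial dims, leaving one histogram per time step
result = histogram(t_series['reflectance'].sel(band=1), bins=[bin_arr], dim=['x','y'])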
[89]
result
<xarray.DataArray 'histogram_reflectance' (time: 44, reflectance_bin: 49)>
dask.array<shape=(44, 49), dtype=int64, chunksize=(1, 49)>
Coordinates:
* reflectance_bin (reflectance_bin) float64 -3.918 -3.755 ... 3.755 3.918
Dimensions without coordinates: time
The original DataArray has time coordinates:
[92]
t_series['reflectance'].sel(band=1)
<xarray.DataArray 'reflectance' (time: 44, y: 1082, x: 1084)>
dask.array<shape=(44, 1082, 1084), dtype=uint16, chunksize=(1, 1082, 1084)>
Coordinates:
band int64 1
* y (y) float64 9.705e+05 9.705e+05 9.705e+05 ... 9.673e+05 9.672e+05
* x (x) float64 4.889e+05 4.889e+05 4.889e+05 ... 4.922e+05 4.922e+05
* time (time) datetime64[ns] 2018-10-12 2018-10-16 ... 2019-05-26
Attributes:
transform: (3.0, 0.0, 488907.0, 0.0, -3.0, 970494.0)
crs: +init=epsg:32630
res: (3.0, 3.0)
is_tiled: 1
nodatavals: (1.0, 1.0, 1.0, 1.0)
scales: (1.0, 1.0, 1.0, 1.0)
offsets: (0.0, 0.0, 0.0, 0.0)
Thanks @rabernat, it works as expected. But should the time coordinates be dropped?
This was a bug that was fixed in #8. We need a new release.
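Until a release with that fix is available, one possible workaround is to copy the time coordinate from the input back onto the histogram output. The snippet below is only a sketch on small synthetic stand-ins (the array contents, sizes, and variable names are made up for illustration); it uses plain xarray's `assign_coords` and does not call xhistogram at all.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the input: 3 time steps of 4x5 pixels, with a time coordinate.
times = np.array(["2018-10-12", "2018-10-16", "2018-10-20"], dtype="datetime64[ns]")
source = xr.DataArray(
    np.zeros((3, 4, 5)),
    dims=("time", "y", "x"),
    coords={"time": times},
    name="reflectance",
)

# Hypothetical stand-in for the histogram output: the time dimension is present
# but its coordinate is missing, as in the output shown above.
result = xr.DataArray(
    np.zeros((3, 6), dtype=np.int64),
    dims=("time", "reflectance_bin"),
    coords={"reflectance_bin": np.linspace(-1.0, 1.0, 6)},
    name="histogram_reflectance",
)

# Reattach the time coordinate; this relies on both arrays having the same
# length and order along the time dimension.
fixed = result.assign_coords(time=source["time"])
```

With the real data, `source` would correspond to `t_series['reflectance'].sel(band=1)` and `result` to the output of `histogram`.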
Hi @rabernat, thanks for the suggestion on this Stack Overflow post: https://stackoverflow.com/questions/57419541/how-to-calculate-histogram-bins-for-each-image-in-an-xarray-dataarray-time-serie. I'm posting here since Stack Overflow's strict character limit cut out some of the code blocks below.
I've tried out the example with what I think are the correct arguments, like so:
where t_series.sel(blue=1) is a DataArray with time, x, and y dimensions. Calling *data_arr in the return statement unpacks the DataArray along the time dimension into separate DataArrays; if this is not done, there is a length mismatch between the bin array list and the DataArray. The list of bin edge arrays and the list of DataArrays match, and I want to take the histogram of each DataArray. I've tried with and without specifying dim=['x', 'y'], but I get the same error:
I'm not sure if this is a bug or if I'm doing something wrong. I tried to create a minimal example that doesn't require a file, but ran into a different issue: https://github.com/xgcm/xhistogram/issues/9
Happy to make this an example of how to use xhistogram with dask if we can sort this out. Any feedback is appreciated!
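For reference, the computation described above (one histogram per image, each with its own bin edges) can be sketched with plain NumPy, without xhistogram or dask. This is only an illustration of the intended result, not the approach from the Stack Overflow answer; the function name `per_image_histograms` and the random input are made up for the example.

```python
import numpy as np

def per_image_histograms(stack, nbins):
    """Histogram each 2-D image in a (time, y, x) stack using its own bin edges."""
    counts, edges = [], []
    for image in stack:
        # Edges span this image's own min..max, so they differ per time step.
        image_edges = np.linspace(image.min(), image.max(), nbins + 1)
        image_counts, _ = np.histogram(image, bins=image_edges)
        counts.append(image_counts)
        edges.append(image_edges)
    # Result shapes: (time, nbins) for counts and (time, nbins + 1) for edges.
    return np.stack(counts), np.stack(edges)

# Hypothetical stand-in for the image time series: 4 time steps of 8x8 pixels.
rng = np.random.default_rng(0)
stack = rng.normal(size=(4, 8, 8))
counts, edges = per_image_histograms(stack, nbins=5)
```

Because each image's edges run from its own minimum to its own maximum, every pixel falls in some bin, so each row of `counts` sums to the number of pixels in that image.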