pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Concatenate 3D array with 2D array #3954

Open zxdawn opened 4 years ago

zxdawn commented 4 years ago

The 3D array has three dims: z, y and x. The 2D array has two dims: y and x. When I try to concatenate them after expanding the 2D array with a z dim, xr.concat raises a ValueError from _dataset_concat.

MCVE Code Sample

import xarray as xr
import numpy as np

x = 2
y = 4
z = 3
data = np.arange(x*y*z).reshape(z, x, y)

# 3d array with coords
a = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

# 2d array without coords
b = xr.DataArray(np.arange(x*y).reshape(x, y)*1.5, dims=['y', 'x'])

# expand 2d to 3d
b = b.expand_dims('z')

# concat
comb = xr.concat([a, b], dim='z')

Expected Output

Same as np.concatenate:

concat_array = np.concatenate((a, b))
comb = xr.DataArray(concat_array, dims=['z', 'y', 'x'])  # dims must be an ordered sequence, not a set
<xarray.DataArray (z: 4, y: 2, x: 4)>
array([[[ 0. ,  1. ,  2. ,  3. ],
        [ 4. ,  5. ,  6. ,  7. ]],

       [[ 8. ,  9. , 10. , 11. ],
        [12. , 13. , 14. , 15. ]],

       [[16. , 17. , 18. , 19. ],
        [20. , 21. , 22. , 23. ]],

       [[ 0. ,  1.5,  3. ,  4.5],
        [ 6. ,  7.5,  9. , 10.5]]])
Dimensions without coordinates: z, y, x

Problem Description

    comb = xr.concat([a, b], dim='z')
  File "E:\miniconda3\envs\satpy\lib\site-packages\xarray\core\concat.py", line 135, in concat
    return f(objs, dim, data_vars, coords, compat, positions, fill_value, join)
  File "E:\miniconda3\envs\satpy\lib\site-packages\xarray\core\concat.py", line 455, in _dataarray_concat
    join=join,
  File "E:\miniconda3\envs\satpy\lib\site-packages\xarray\core\concat.py", line 395, in _dataset_concat
    raise ValueError("%r is not present in all datasets." % k)
ValueError: 'z' is not present in all datasets.

As suggested by @dcherian, assigning the coordinate label by changing b = b.expand_dims('z') to b = b.expand_dims(z=[3]) makes it work.
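
A minimal sketch of that workaround, rebuilding the arrays from the sample above with literal sizes. The name b_labeled is introduced here for illustration only; passing a list to expand_dims creates the new dimension together with its coordinate label.

```python
import numpy as np
import xarray as xr

# 3d array with a 'z' coordinate (same layout as in the sample above)
a = xr.DataArray(
    np.arange(24).reshape(3, 2, 4),
    dims=["z", "y", "x"],
    coords={"z": np.arange(3)},
)

# 2d array without coords
b = xr.DataArray(np.arange(8).reshape(2, 4) * 1.5, dims=["y", "x"])

# expand 2d to 3d and label the new 'z' entry at the same time
b_labeled = b.expand_dims(z=[3])

comb = xr.concat([a, b_labeled], dim="z")
print(comb.shape)  # (4, 2, 4)
print(comb["z"].values)
```

The values of comb should match what np.concatenate returns for the same inputs, with the added benefit that z carries the labels 0 to 3.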

Versions

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
libhdf5: None
libnetcdf: None

xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: None
dask: 2.10.1
distributed: 2.14.0
matplotlib: 3.2.1
cartopy: 0.17.0
seaborn: 0.10.0
numbagg: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: None
IPython: 7.13.0
sphinx: 2.4.4

fujiisoup commented 4 years ago

Hi, @zxdawn

Thank you for raising the issue. I think you need an actual value for z: b.expand_dims('z') only knows that z is a dimension name and does not attach a value to it.
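
The distinction can be checked directly. This is a small sketch, not part of the original report; the names b_expanded, has_dim and has_coord are made up for illustration.

```python
import numpy as np
import xarray as xr

# 2d array without coords, as in the sample above
b = xr.DataArray(np.arange(8).reshape(2, 4) * 1.5, dims=["y", "x"])

# expand_dims with only a name adds a length-1 dimension and nothing else
b_expanded = b.expand_dims("z")

has_dim = "z" in b_expanded.dims      # the dimension exists
has_coord = "z" in b_expanded.coords  # but no coordinate is attached to it
print(has_dim, has_coord)  # prints: True False
```

Because a carries a z coordinate and b_expanded does not, concat cannot find z in both objects.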

You can add one like this:

b['z'] = 3  # add a scalar coordinate named 'z'

This gives z a value (we call it a coordinate). Then your script will work:

b = b.expand_dims('z')  # expand 2d to 3d
comb = xr.concat([a, b], dim='z')

zxdawn commented 4 years ago

Thanks, @fujiisoup. @dcherian decided to improve the error message later, so I will leave this open.

fujiisoup commented 4 years ago

Ah, OK. Makes sense. Thanks.

JavierRuano commented 4 years ago

import xarray as xr
import numpy as np

x = 2
y = 4
z = 3
data = np.arange(x*y*z).reshape(z, x, y)

# 3d array with coords
a = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

# 2d array without coords
b = xr.DataArray(np.arange(x*y).reshape(x, y)*1.5, dims=['y', 'x'])

# assign a scalar z coordinate (instead of expanding 2d to 3d)
b = b.assign_coords({'z': 3})

comb = xr.concat([a, b], dim='z')

Perhaps you need something else. According to http://xarray.pydata.org/en/stable/generated/xarray.concat.html, the objects passed to concat must consist of variables and coordinates with matching shapes.

If you compare a.shape and b.shape, you will see that they are different.

Regards,
Javier Ruano
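
A short sketch of that comparison, assuming the same arrays as in the original sample but with literal sizes. The names b_z and comb are used here for illustration only.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(
    np.arange(24).reshape(3, 2, 4),
    dims=["z", "y", "x"],
    coords={"z": np.arange(3)},
)
b = xr.DataArray(np.arange(8).reshape(2, 4) * 1.5, dims=["y", "x"])

# The two arrays differ in rank: a has a leading 'z' axis, b does not
print(a.shape, b.shape)

# Give b a scalar 'z' coordinate; concat then promotes it along 'z'
b_z = b.assign_coords(z=3)
comb = xr.concat([a, b_z], dim="z")
print(comb.sizes["z"])
```

Here the scalar coordinate tells concat where b belongs along z, so no explicit expand_dims call is needed.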

stale[bot] commented 2 years ago

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.