SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
633 stars 283 forks source link

Lazy SUM with 1D weights sometimes fails #5083

Open schlunma opened 1 year ago

schlunma commented 1 year ago

🐛 Bug Report

Collapsing a cube with lazy data using iris.analysis.SUM and 1D weights

cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1]) 

sometimes fails due to broadcasting errors:

Traceback (most recent call last):
  File "/home/b/b309141/scripts/iris/cube_weights.py", line 24, in <module>
    cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1])
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/cube.py", line 3785, in collapsed
    data_result = aggregator.lazy_aggregate(
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 545, in lazy_aggregate
    return self.lazy_func(data, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1136, in inner_stat
    dask_result = dask_stats_function(array, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1136, in inner_stat
    dask_result = dask_stats_function(array, axis=axis, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/iris/analysis/__init__.py", line 1408, in _lazy_sum
    wsum = da.sum(weights_in * array, **kwargs)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 223, in wrapper
    return f(self, other)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 2349, in __rmul__
    return elemwise(operator.mul, other, self)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 4762, in elemwise
    broadcast_shapes(*shapes)
  File "/work/bd0854/b309141/mambaforge/envs/esm/lib/python3.10/site-packages/dask/array/core.py", line 4690, in broadcast_shapes
    raise ValueError(
ValueError: operands could not be broadcast together with shapes (2,) (2, 3)

How To Reproduce

import dask.array as da
import iris
print("iris version:", iris.__version__)
print()
from iris.coords import AuxCoord
from iris.cube import Cube
aux_coord = AuxCoord([1, 2], var_name='a')
cube = Cube(da.arange(2 * 3).reshape(2, 3), aux_coords_and_dims=[(aux_coord, 0)])
cube = cube.collapsed('a', iris.analysis.SUM, weights=[0, 1])

Expected behaviour

No fail.

Environment

rcomer commented 1 year ago

I've confirmed this failure is still reproduced in the v3.4 release candidate. This is only a problem for the lazy case because in the real case collapsed always moves the aggregation dimension(s) to the trailing dim

https://github.com/SciTools/iris/blob/57647d2c413dc3b85ecdde566a6340c9e43ad83b/lib/iris/cube.py#L3922-L3932

That said, if you pass your weights as a list in the real case, you get a failure here

https://github.com/SciTools/iris/blob/57647d2c413dc3b85ecdde566a6340c9e43ad83b/lib/iris/cube.py#L3936

Possibly we just need to use broadcast_to_shape before

https://github.com/SciTools/iris/blob/57647d2c413dc3b85ecdde566a6340c9e43ad83b/lib/iris/analysis/__init__.py#L1620

trexfeathers commented 1 year ago

Probably best to look at this AFTER the changes in #5084!

schlunma commented 1 year ago

Yes, definitely! I actually noticed this bug while writing tests for #5084 (it is still present after applying the changes from this PR).