NCAS-CMS / PyActiveStorage

Python implementation of Active Storage
2 stars 2 forks source link

ValueError raised by reduce_chunk in PyActiveStorage - induced by a chunk of all zeros being treated as all missing? #194

Closed bnlawrence closed 6 months ago

bnlawrence commented 8 months ago

An attempt to do some work on small chunks using PyActiveStorage version 1 raises a ValueError if the values of the array in the chunk is all zeros.

(mampy5) [15:09:00:sci4 bnl]$Traceback (most recent call last):
 File "/home/users/lawrence/repos/PyActiveStorage/bnl/bnl_timeseries_runner.py", line 15, in <module>
  timeseries(location, b, v, t)
 File "/home/users/lawrence/repos/PyActiveStorage/bnl/bnl_timeseries_ral.py", line 65, in timeseries
  ts.append(active[i,0,0:960,:][0])
       ~~~~~~^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 213, in __getitem__
  return self._get_selection(index, __metric_name='selection 1 time (s)')
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 57, in timed
  result = method(self,*args, **kw)
       ^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 328, in _get_selection
  return self._from_storage(ds, indexer, array._chunks, out_shape, dtype, compressor, filters, drop_axes)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/repos/PyActiveStorage/activestorage/active.py", line 401, in _from_storage
  out = method(out)
     ^^^^^^^^^^^
 File "/home/users/lawrence/.conda/envs/mampy5/lib/python3.12/site-packages/numpy/core/fromnumeric.py", line 2313, in sum
  return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/users/lawrence/.conda/envs/mampy5/lib/python3.12/site-packages/numpy/core/fromnumeric.py", line 88, in _wrapreduction
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (180,) + inhomogeneous part.

Tracking this down I find the bug occurring here:

if method:
        if missing != (None, None, None, None):
            tmp = remove_missing(tmp, missing)
        # check on size of tmp; method(empty) returns nan
        if tmp.any():
            return method(tmp), tmp.size # wanted this one
        else:
            return tmp, None.  # wrong thing returned
    else:
        return tmp, None

because although in this case the method is sum, the check on tmp.any() is returning False because tmp is all zeros, and so we get execution going through the line with #wrong thing returned. I think this test using tmp.any is not doing what the author intended. We need another solution.

bnlawrence commented 8 months ago

I think the new unit test should ensure that some chunks are all zeros ... but some are not ... and we get the right answer for version 1 and 2 ... and that it does raise this error with the current code.

davidhassell commented 8 months ago

np.ma.count(tmp) could be the answer.

bnlawrence commented 8 months ago

There are three issues in play here:

  1. Even if we go through the "wrong" thing route, we want it to return one missing value and a zero count, not a missing array.
  2. When they are all zeros, we want it to give the right answer.
  3. It occurs in both reduce_chunk and reduce_opens3_chunk

I think your suggestion will address the second ...

davidhassell commented 8 months ago

Your write right,. I should have said if tmp.size: could be the answer! (I hadn't realized that the tmp is np.ma.compressed).

davidhassell commented 8 months ago

A small put important change pointed out by @bnlawrence that should have made it int the original PR: #197

valeriupredoi commented 6 months ago

closed via #197 - very good catch and fix, gents :beer: