nums-project / nums

A library that translates Python and NumPy to optimized distributed systems code.
Apache License 2.0

Add support for advanced indexing. #125

Open elibol opened 3 years ago

elibol commented 3 years ago

Required for the Mandelbrot benchmark: https://github.com/IntelPython/dpbench/blob/feature/dist/distributed/mandelbrot/mandelbrot_nums.py

See #106

briancpark commented 3 years ago

When I use advanced (boolean mask) indexing to convert an array into a binary classification array, I get the following error:

y_train[y_train > 0] = 1.0
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/tmp/ipykernel_33143/1447231709.py in <module>
----> 1 y_train[y_train > 0] = 1.0

~/external/anaconda3/envs/aws-asdi/lib/python3.7/site-packages/nums/core/array/blockarray.py in __setitem__(self, key, value)
    360     def __setitem__(self, key, value):
    361         av: ArrayView = ArrayView.from_block_array(self)
--> 362         av[key] = value
    363 
    364     @staticmethod

~/external/anaconda3/envs/aws-asdi/lib/python3.7/site-packages/nums/core/array/view.py in __setitem__(self, key, value)
    213             return self.assign((key,), value)
    214         else:
--> 215             raise Exception("setitem failed", key)
    216 
    217     def assign(self, subscript: Tuple, value):

Exception: ('setitem failed', BlockArray([Block(ObjectRef(d48d33c49312269fffffffffffffffffffffffff0100000001000000))]))

A temporary workaround is to convert the array back to a NumPy array, do the advanced indexing there, and then convert the result back to a NumS array. The drawback is that the block shape is sometimes modified before these operations (in my case, by a train/test split on a block-partitioned array), so a block reshape is also required:

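# Assumes `import nums.numpy as nps`.
y_train_block_shape = y_train.block_shape  # remember the original block shape
y_train = y_train.get()                    # materialize as a NumPy array
y_train[y_train > 0] = 1                   # mask assignment works in NumPy
y_train = nps.array(y_train)               # convert back to a NumS array
y_train = y_train.reshape(block_shape=y_train_block_shape)  # restore blocks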
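
For reference, here is a minimal plain-NumPy sketch of what the mask assignment computes, next to two formulations that avoid `__setitem__` with a boolean mask. The sample values are made up for illustration. Whether the elementwise forms run on a NumS `BlockArray` is an assumption this thread does not confirm, so treat them as something to try, not a verified fix.

```python
import numpy as np

# Hypothetical labels standing in for y_train (values chosen for illustration).
y = np.array([-1.0, 0.0, 2.5, 3.0, -0.5])

# Boolean-mask assignment: the operation that fails in the traceback above.
masked = y.copy()
masked[masked > 0] = 1.0

# Same result via np.where, which selects elementwise instead of assigning.
elementwise = np.where(y > 0, 1.0, y)

# Same result using only comparison and arithmetic:
# keep non-positive entries, replace positive entries with 1.
positive = (y > 0).astype(y.dtype)
arithmetic = y * (1.0 - positive) + positive

print(masked)
print(elementwise)
print(arithmetic)
```

All three arrays hold the same values. The arithmetic form relies only on comparison, multiplication, and addition, so it may be the easiest one to express in a library that does not yet support mask assignment.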