apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Feature request: numpy.cumsum #13001

Closed: eric-haibin-lin closed this issue 4 years ago

eric-haibin-lin commented 5 years ago

MXNet equivalent of numpy.cumsum: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.cumsum.html?highlight=cumsum#numpy.cumsum
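
For reference, the NumPy semantics being requested (mirroring the example in the linked documentation):

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
np.cumsum(a)          # array([ 1,  3,  6, 10, 15, 21])   (flattened when axis is None)
np.cumsum(a, axis=0)  # array([[1, 2, 3], [5, 7, 9]])
np.cumsum(a, axis=1)  # array([[ 1,  3,  6], [ 4,  9, 15]])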

ChaiBapchya commented 5 years ago

Have you come across any use case (specific to MXNet)?

szha commented 5 years ago

cumsum is needed when you have an array of lengths with which to slice another, concatenated array, and you want to calculate the start and end indices to use. For example, suppose you have three arrays of different lengths and you concatenate them: [1, 2, 3], [4, 5], [6, 7, 8, 9] => [1, 2, 3, 4, 5, 6, 7, 8, 9]. Their corresponding lengths are [3, 2, 4]. The cumsum would be [3, 5, 9], with which you can slice [0, 3) for the first array, [3, 5) for the second, and [5, 9) for the third.
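
A minimal NumPy sketch of that slicing pattern, using the arrays from the comment above:

import numpy as np

parts = [np.array([1, 2, 3]), np.array([4, 5]), np.array([6, 7, 8, 9])]
flat = np.concatenate(parts)                 # [1 2 3 4 5 6 7 8 9]
lengths = np.array([len(p) for p in parts])  # [3 2 4]
ends = np.cumsum(lengths)                    # [3 5 9]
starts = ends - lengths                      # [0 3 5]
pieces = [flat[s:e] for s, e in zip(starts, ends)]  # recovers the three original arrays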

eric-haibin-lin commented 5 years ago

For example, the validation metric for object detection: https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/metrics/voc_detection.py#L189-L211 Right now it uses numpy and is extremely slow. You could get an order-of-magnitude speedup if these ops were available in MXNet.
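
Roughly the cumsum pattern that metric relies on, as a hedged sketch with made-up true/false-positive flags (the real code also tracks per-class ground-truth counts):

import numpy as np

tp = np.array([1, 0, 1, 1, 0])  # hypothetical per-detection true-positive flags, sorted by score
fp = 1 - tp
num_gt = 3                      # assumed number of ground-truth objects
precision = np.cumsum(tp) / np.maximum(np.cumsum(tp) + np.cumsum(fp), 1)
recall = np.cumsum(tp) / num_gt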

tsuberim commented 5 years ago

I'd really like to have this function for computing returns of rewards in the context of reinforcement learning.
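
As a small sketch of that use case (made-up rewards, no discounting): with gamma = 1, the return at each timestep is just a reversed cumulative sum of the rewards.

import numpy as np

rewards = np.array([1.0, 0.0, 2.0, 1.0])
returns = np.cumsum(rewards[::-1])[::-1]  # [4., 3., 3., 1.]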

MyYaYa commented 5 years ago

Yeah, if NDArray supported a cumsum operation, some custom metrics (e.g. mAP) would benefit from it.

MyYaYa commented 5 years ago

I found an elegant method to implement cumsum. Hope this will be useful.

import mxnet as mx

# Carry the running sum as the loop state; outs is then the cumsum of data along axis 0.
step = lambda data, states: (data + states[0], [data + states[0]])
data = mx.nd.ones((10, 10))
states = [mx.nd.zeros(10)]
outs, states = mx.nd.contrib.foreach(step, data, states)

tsuberim commented 5 years ago

@MyYaYa Cool, is that also differentiable?

MyYaYa commented 5 years ago

@tsuberim Yeah, as far as I know the Symbol API in MXNet supports auto-differentiation, so you can try mx.sym.contrib.foreach(); maybe that will work for you. Here is the link
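
A quick, hedged check (assuming MXNet 1.x) that the imperative NDArray version is also differentiable under mx.autograd, reusing the step function from the foreach-based solution above:

import mxnet as mx

step = lambda data, states: (data + states[0], [data + states[0]])
data = mx.nd.arange(12).reshape((4, 3))
data.attach_grad()
with mx.autograd.record():
    outs, _ = mx.nd.contrib.foreach(step, data, [mx.nd.zeros(3)])
    loss = outs.sum()
loss.backward()
print(data.grad)  # row i receives gradient (4 - i), as expected for a cumsum along axis 0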

soldni commented 5 years ago

Thank you @MyYaYa for the solution! I've wrapped it up into a generic cumsum function here while we wait for official support in future versions.

chinakook commented 5 years ago

Cumsum is important because we need it to calculate the Lovasz loss.

MyYaYa commented 5 years ago

Regarding the foreach-based cumsum I posted above: I now find that solution is very slow, especially when the array is large, and it has become a major time cost in my program. Is there a more efficient way to implement cumsum using MXNet's NDArray?
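
One possible workaround, sketched under the assumption that an O(n^2) mask of ones is affordable: the cumulative sum along axis 0 is a single matrix product with a lower-triangular ones matrix, which replaces the Python-level foreach loop with one NDArray op.

import numpy as np
import mxnet as mx

data = mx.nd.random.uniform(shape=(1000, 128))
n = data.shape[0]
tril = mx.nd.array(np.tril(np.ones((n, n), dtype='float32')))  # lower-triangular ones mask
cumsum = mx.nd.dot(tril, data)  # row i is the sum of rows 0..i of data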

chinakook commented 5 years ago

I think a better solution is to write a C++ version of this op.

crmne commented 5 years ago

It would be great for implementing the Earth Mover's loss more efficiently.
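
A hedged sketch of why cumsum helps there: for 1-D distributions on a common, evenly spaced support, the Earth Mover's Distance reduces to the L1 distance between the two CDFs, i.e. between cumulative sums (the probability vectors below are made up).

import numpy as np

p = np.array([0.1, 0.4, 0.5])
q = np.array([0.3, 0.3, 0.4])
emd = np.abs(np.cumsum(p) - np.cumsum(q)).sum()  # 0.3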

RuRo commented 5 years ago

It would also be great for computing the CDF from a histogram.
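
A small sketch of that use case with made-up samples: cumulative bin counts, normalized by the total, give the empirical CDF at the right-hand bin edges.

import numpy as np

counts, edges = np.histogram(np.random.randn(10000), bins=50)
cdf = np.cumsum(counts) / counts.sum()  # CDF evaluated at edges[1:]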

ChaiBapchya commented 5 years ago

I've taken note of all the above comments. All these use cases make this feature request very relevant. Tagging myself here as I start working on this.

haojin2 commented 4 years ago

Added in mxnet.np module now, closing.
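
For anyone landing here later, usage looks roughly like this (assuming an MXNet version that ships the mxnet.np module, e.g. 1.6 or newer):

from mxnet import np, npx
npx.set_np()  # enable NumPy-compatible array semantics

a = np.array([[1, 2, 3], [4, 5, 6]])
np.cumsum(a)          # flattened cumulative sum: [ 1.  3.  6. 10. 15. 21.]
np.cumsum(a, axis=0)  # column-wise: [[1. 2. 3.] [5. 7. 9.]]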