apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Ndarray support extension problem #19332

Open LourisXu opened 3 years ago

LourisXu commented 3 years ago

For example, when training in parallel on multiple GPUs, I use SigmoidBinaryCrossEntropyLoss as the loss function. Its fourth parameter, pos_weight, has to be copied to every GPU ahead of time, so that each GPU can compute its share of the loss with its own local copy of the weights.

The code is shown below:

import mxnet as mx
from mxnet import nd, autograd

ctx = [mx.gpu(0), mx.gpu(1)]
pos_weights = []
for ctx_i in ctx:    # copy pos_weight onto every GPU
    w = nd.array(pos_weight, ctx=ctx_i)
    pos_weights.append(w)

...

        for i, batch in enumerate(train_iter):
            Xs, ys, batch_size = _get_batch(batch, ctx)
            ls = []
            with autograd.record():
                y_hats = [net(X) for X in Xs]
                ls = [loss(y_hat, y, None, pos_weights[ctx.index(y.context)]) for y_hat, y in zip(y_hats, ys)]  # each GPU uses its corresponding pos_weight to calculate the loss
            for l in ls:
                l.backward()

If each GPU does not get its own copy of pos_weight (for example, if a single copy on one device is shared by losses computed on the other GPUs), errors like the following may arise:

mxnet/mshadow/mshadow/./stream_gpu-inl.h:62: Check failed: e == cudaSuccess CUDA: an illegal memory access was encountered

Check failed: type_ != nullptr: The any container is empty requested=N5mxnet
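To illustrate, continuing the snippet above (reusing loss, y_hats, and ys from the training loop), this is a sketch of the anti-pattern that can trigger such errors:

# Anti-pattern sketch: pos_weight lives only on gpu(0), but the loss
# shards running on the other GPUs also read it, which can surface as
# a cross-device error like the illegal memory access above.
w = nd.array(pos_weight, ctx=mx.gpu(0))   # single copy on one device
ls = [loss(y_hat, y, None, w) for y_hat, y in zip(y_hats, ys)]  # same w everywhere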

So, is there a simpler method or API, similar to the way a model's parameters are loaded onto multiple GPUs, to achieve this?
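Something along these lines is what I have in mind (just a sketch; replicate_to is a made-up name, not an existing MXNet API, analogous in spirit to how gluon.utils.split_and_load distributes a batch across devices):

import mxnet as mx
from mxnet import nd

def replicate_to(arr, ctx_list):
    # Hypothetical helper (not an MXNet API): copy an array-like
    # onto every context so each GPU has a local replica.
    base = arr if isinstance(arr, nd.NDArray) else nd.array(arr)
    return [base.as_in_context(c) for c in ctx_list]

# usage: one pos_weight replica per GPU, indexed via ctx.index(...)
ctx = [mx.gpu(0), mx.gpu(1)]
pos_weights = replicate_to(pos_weight, ctx)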

github-actions[bot] commented 3 years ago

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.