Open ThomasDelteil opened 6 years ago
Shouldn't it return the same error as when trying to add from CPU? Note that sometimes I get an Illegal Memory Access
error instead of invalid handle.
Edit: I realized that after training my multi-GPU model, where I stored the loss for each GPU in a separate value on its own GPU, I am able to add these losses together without copying them across devices. Can someone explain to me why this is possible? I thought you could not add across GPUs.
>>> train_loss
[
[ 49.52454758]
<NDArray 1 @gpu(0)>,
[ 49.66656113]
<NDArray 1 @gpu(1)>]
>>> train_loss[0] + train_loss[1]
[ 99.1911087]
<NDArray 1 @gpu(0)>
>>> train_loss[1] + train_loss[0]
[ 99.1911087]
<NDArray 1 @gpu(1)>
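Whether the implicit cross-GPU add shown above is supported behavior is not settled in this thread. Until it is, one way to avoid relying on it is to copy every operand to a single context before adding. The sketch below is a minimal illustration, not MXNet's own mechanism: the helper names `contexts_of` and `sum_on_context` are hypothetical, and the code only assumes that each array exposes a `.context` attribute and an `as_in_context(ctx)` method, as `mx.nd.NDArray` does.

```python
def contexts_of(arrays):
    """Return the distinct contexts the arrays live on, in first-seen order."""
    seen = []
    for arr in arrays:
        if arr.context not in seen:
            seen.append(arr.context)
    return seen


def sum_on_context(arrays, ctx):
    """Copy each array to ctx, then add the copies on that context.

    Assumes as_in_context returns the array itself when it is already
    on ctx and a copy on ctx otherwise (the documented NDArray behavior).
    """
    total = None
    for arr in arrays:
        local = arr.as_in_context(ctx)
        total = local if total is None else total + local
    return total
```

With the list from the example, `contexts_of(train_loss)` would report both GPU contexts, and `sum_on_context(train_loss, train_loss[0].context)` would do the addition on `gpu(0)` after an explicit copy, so the result does not depend on how mixed-context operands are handled.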
Thanks for submitting this issue, @ThomasDelteil. Could you add the labels "Memory" and "Bug" to this?
@kalyc I am not a committer and do not have labelling rights. @nswamy, could you add the labels please?
Description
When adding NDArray on different contexts, I get either:
Environment info (Required)
Build info (Required if built from source)
pip install mxnet-cu91mkl --pre
Also happens with 1.2.0:
pip install mxnet-cu91
Error Message:
Steps to reproduce