meder411 / PyTorch-EMDLoss

PyTorch 1.0 implementation of the approximate Earth Mover's Distance

CUDA kernel failed: no error (pkg/include/cuda/emd.cuh:247) #3

Open fsalmasri opened 5 years ago

fsalmasri commented 5 years ago

Could you help me with this error, please?

CUDA kernel failed: no error (pkg/include/cuda/emd.cuh:247)

meder411 commented 5 years ago

Possibly out of memory on the GPU. What are the dimensions of the data you're using, and how much RAM does your GPU have? I'm still trying to figure out the best way to catch CUDA errors with a meaningful message. I have seen the kernel failure with "no error" when I've tried to allocate too much memory on the GPU, though.

I might be catching the last error too late in the code, which is why the error code suggests no error. I'll take a look when I have some time.

fsalmasri commented 5 years ago

I don't think it is a memory problem. I just ran your test code: it worked correctly on a TITAN Xp, but this problem arose on a TITAN X. Is it possible to use it on a tensor of shape (X,Y,Z)?

meder411 commented 5 years ago

I'm not sure what shape that entails. The arguments to the loss should have dimensions (B x N x D) where B is the batch size, N is the number of points, and D is the dimensionality (e.g. [X,Y,Z] is 3 dimensions).
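For readers following along, the (B x N x D) layout described above can be checked before calling the loss. This is only a minimal sketch: `check_emd_shapes` is a hypothetical helper, not part of this repository, and it assumes only that both inputs must be 3-dimensional with matching B and D. Whether the kernel itself accepts D other than 3, or different N for the two point sets, is not confirmed in this thread.

```python
def check_emd_shapes(shape1, shape2):
    """Validate two (B, N, D) shapes and return (B, N1, N2, D).

    Hypothetical helper for illustration; not part of PyTorch-EMDLoss.
    """
    for name, shape in (("p1", shape1), ("p2", shape2)):
        if len(shape) != 3:
            raise ValueError(f"{name} must be (B, N, D), got {tuple(shape)}")
    b1, n1, d1 = shape1
    b2, n2, d2 = shape2
    if b1 != b2:
        raise ValueError(f"batch sizes differ: {b1} vs {b2}")
    if d1 != d2:
        raise ValueError(f"point dimensionality differs: {d1} vs {d2}")
    return b1, n1, n2, d1


# Example: a batch of 32 point clouds, 400 points each, 3 coordinates per point.
print(check_emd_shapes((32, 400, 3), (32, 400, 3)))
```

With PyTorch tensors, the same check could be applied to `tuple(p1.shape)` and `tuple(p2.shape)`.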

fsalmasri commented 5 years ago

Exactly, this is what I meant: a 3-dimensional tensor where B is the batch size, N is the number of points or variables in each batch element, and D could be the histogram values of 10 bins.

fsalmasri commented 5 years ago

Do you have any idea why it worked on the Titan Xp and not on the Titan X?

meder411 commented 5 years ago

I'm sorry, I don't know

nsarasua commented 5 years ago

> Do you have any idea why it worked on the Titan Xp and not on the Titan X?

Any news on this? I am getting the same error on my Titan X

kongsgard commented 5 years ago

On a GeForce GTX 1080 with total memory 8117MiB I can calculate the EMDLoss in the script test_emd_loss.py with tensors p1, p2 with a size of up to about [B,N,D] = [32,400,3]. If I increase N further, I get the same error message as commented above.
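A back-of-envelope estimate may help judge whether memory is a plausible cause at these sizes. The sketch below assumes, without having checked the source, that the largest allocation is a dense B x N x N float32 matrix (for example a pairwise cost or matching matrix); `dense_matrix_mib` is a hypothetical name used only for this illustration.

```python
def dense_matrix_mib(batch, n_points, bytes_per_value=4):
    """Size in MiB of one batch x n_points x n_points matrix (float32 by default)."""
    return batch * n_points * n_points * bytes_per_value / (1024 ** 2)


# B = 32 with the N that works (400) and the N that is wanted (2000).
for n in (400, 2000):
    print(n, round(dense_matrix_mib(32, n), 2))
```

Under that assumption, one such matrix takes about 20 MiB at N = 400 and about 488 MiB at N = 2000, both far below 8117 MiB, which would point away from plain memory exhaustion. The real implementation may allocate several such buffers, so this is a rough lower bound, not a measurement.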

Ideally, I would like to increase N to about 2000.

fsalmasri commented 5 years ago

No, it works only on the Titan Xp. By the way, I couldn't get my network to converge using this EMD function. I can't go through the code, so I am comparing it with other implementations of EMD.

meder411 commented 5 years ago

Hm. Okay. I’ll go through this again when I have some time. I cut it out of a bigger package so something may have broken when I was cleaning it up. Apologies for the issue

Jmq14 commented 5 years ago

Modifying BLOCK_SIZE to 1024 solved this issue in my case where the size of input tensors is [B,N,D] = [16, 1024, 3].

123helloworld123 commented 5 years ago

> No, it works only on the Titan Xp. By the way, I couldn't get my network to converge using this EMD function. I can't go through the code, so I am comparing it with other implementations of EMD.

Did you find any other implementations of EMD that can be used to compare two grayscale images?

louis-cl commented 5 years ago

I have the same problem. In my case the error only arises when N > 512. It doesn't look like a memory error, as it works for Bx512x3 (I tried B up to 100). Changing BLOCK_SIZE to 1024, as @Jmq14 proposed, allows it to run up to N=1024, but N=1025 fails. I am using a Titan Xp.
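The pattern reported here (N = 512 works, N = 513 fails, and BLOCK_SIZE = 1024 moves the limit to exactly 1024) is what one would expect if the kernel handles each point with one thread inside a single block. That is an assumption inferred from these reports, not something confirmed against the source. What is certain is that CUDA limits a block to 1024 threads on current GPUs, so raising BLOCK_SIZE beyond 1024 cannot extend the limit further. A minimal sketch of that reasoning, with `launch_would_fit` as a hypothetical helper:

```python
# CUDA limit on threads per block (compute capability 2.0 and later).
MAX_THREADS_PER_BLOCK = 1024


def launch_would_fit(n_points, block_size):
    """Predict whether N points fit, ASSUMING one thread per point in one block."""
    if block_size > MAX_THREADS_PER_BLOCK:
        # The launch configuration itself would be rejected by CUDA.
        return False
    return n_points <= block_size


for n, block in ((512, 512), (513, 512), (1024, 1024), (1025, 1024), (2000, 2048)):
    print(n, block, launch_would_fit(n, block))
```

If the assumption holds, supporting N around 2000 would need a kernel change (for example, looping over points within each thread or splitting the work across blocks) rather than a larger BLOCK_SIZE.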

noahstier commented 4 years ago

I had the same issue and I found that this implementation was able to handle more points: https://github.com/daerduoCarey/PyTorchEMD

yikakabu commented 3 years ago

@fsalmasri Hi. I hit the same error on a Titan X with tensor shape [32, 1024, 4]. Have you solved this issue?

dxfhu2012 commented 1 year ago

@fsalmasri Hi. I hit the same error on a P104-100, but I can use the same tensor shape on a 1660 Ti or a 2080 Ti.