Closed zsrkmyn closed 7 years ago
Hi,
This is expected behavior because of how device-side asserts work on CUDA.
If you want the error to be raised at the correct place, you can set the environment variable CUDA_LAUNCH_BLOCKING=1
to make the CUDA API synchronous and raise the error when it occurs, but that will slow down your code.
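In practice the variable has to be in the environment before the process initializes CUDA, e.g. `CUDA_LAUNCH_BLOCKING=1 th script.lua` from the shell. A minimal Python sketch of the same idea (the script name `train.lua` is purely illustrative):

```python
import os
import shutil
import subprocess

# CUDA_LAUNCH_BLOCKING is read when the CUDA runtime initializes, so it must
# be set before any CUDA code is loaded. A reliable way is to launch the
# training script as a child process with the variable already exported.
env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")

# "train.lua" is a hypothetical script name; guard on `th` being installed.
if shutil.which("th"):
    subprocess.run(["th", "train.lua"], env=env)
```

Note that setting the variable after CUDA has already been initialized in the same process has no effect, which is why exporting it for the whole process is the safest route.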
@albanD thanks a lot! I think I will set CUDA_LAUNCH_BLOCKING=1
when I am debugging and unset it when running.
I am not sure whether the issue should be owned by cutorch or cunn.
When I use LookupTable with CUDA and the given index is out of range of the lookup table, torch doesn't report the error at once; instead, the error occurs when the following layer begins to forward.
Here is an example:
the output is:
As the output shows, we can even print the size of the result from the `lut`; the error only occurs when the linear layer begins to forward it. If I switch from GPU to CPU, there is no such problem. Although this is not a big issue, it can be really confusing when debugging the code. I am using CUDA 8.0, torch7 built from torch/torch7@7c26baf, cunn built from torch/cunn@b9ab0f7, cutorch built from torch/cutorch@181a869, and nn built from torch/nn@22ffc4f.
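The deferred report is inherent to how CUDA queues kernel launches: the out-of-range access trips a device-side assert inside the LookupTable kernel, but the failure only surfaces at the next synchronizing call, which here happens to be the linear layer's forward. A toy, pure-Python model of that scheduling (this is not CUDA; all names are illustrative):

```python
import queue
import threading

class FakeDevice:
    """Toy model of an asynchronous device: 'kernel launches' are queued and
    run on a worker thread; a failure is only reported at the next
    synchronizing operation, mirroring how a device-side assert in the
    LookupTable kernel is surfaced by a *later* layer's launch."""

    def __init__(self):
        self._q = queue.Queue()
        self._error = None
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            fn = self._q.get()
            try:
                fn()
            except Exception as e:      # remember the failure, report it later
                self._error = e
            finally:
                self._q.task_done()

    def launch(self, fn, blocking=False):
        # blocking=True is the CUDA_LAUNCH_BLOCKING=1 analogue: the error is
        # raised at the offending call instead of at a later one.
        self._q.put(fn)
        if blocking:
            self.synchronize()

    def synchronize(self):
        self._q.join()
        if self._error is not None:
            err, self._error = self._error, None
            raise err

table = [10, 20, 30]
dev = FakeDevice()
dev.launch(lambda: table[5])        # out-of-range "lookup": queued, no error yet
print("lookup returned without error")      # like printing the result's size
try:
    dev.launch(lambda: None, blocking=True)  # next "layer" triggers the report
except IndexError:
    print("error surfaced at the next synchronizing launch")
```

The same structure explains the CPU case: on CPU the lookup runs synchronously in the calling thread, so the bad index raises right where it happens.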
I am happy to provide more details if you need them :-)