Closed fishbotics closed 6 months ago
@ys-2020, could you please take a look at this issue when you have time? Thanks!
Hi @fishbotics. Thank you for your interest! Could you please describe in more detail what you mean by "it does crash in the original terminal (from which the data was saved)"? From my understanding, there shouldn't be much difference between terminals if the previous job has been stopped. (And if that is the case, the problem seems less likely to be caused by TorchSparse.)
I am also running into this problem.
With set_kmap_mode(hashmap), this error disappears. Why?
Merged into issue #239.
Is there an existing issue for this?
Current Behavior
I am using ResNet21D and have been getting Illegal Memory Access errors. They started when I reduced my voxel size, but I'm not sure why. I'd like to share some sample data, but I've had a hard time making the failure reproducible. I have two questions, which are listed after the details below.
Here's the error I'm getting reliably:
Note that x is the output from some other torch sparse operations. The failure happens when I pass it into a 3D Conv that looks like this: sopc_encoder_2 = spnn.Conv3d(16, 32, 2, stride=2, dilation=1)
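Since the errors appeared after the voxel size was reduced, one check that may be worth running is a summary of the integer voxel coordinates at the old and new voxel sizes. The sketch below is plain Python and does not touch TorchSparse; the helper names (voxelize, summarize_coords) and the sample points are hypothetical, and whether negative coordinates, duplicates, or a larger coordinate range actually matter for the installed TorchSparse version is an assumption that still needs verifying.

```python
import math
from collections import Counter


def voxelize(points, voxel_size):
    """Quantize (x, y, z) points to integer voxel coordinates by flooring."""
    return [tuple(math.floor(c / voxel_size) for c in p) for p in points]


def summarize_coords(coords):
    """Report range, negatives, and duplicates for integer voxel coordinates."""
    counts = Counter(coords)
    flat = [c for coord in coords for c in coord]
    return {
        "min": min(flat),
        "max": max(flat),
        "has_negative": min(flat) < 0,
        "num_voxels": len(counts),
        "num_duplicates": len(coords) - len(counts),
    }


# Hypothetical points, purely for illustration.
points = [(0.05, 0.05, 0.05), (0.25, 0.05, 0.05), (-0.35, 1.25, 0.95)]

for voxel_size in (0.5, 0.1):
    print(voxel_size, summarize_coords(voxelize(points, voxel_size)))
```

Running the same summary on the real coordinates just before the failing convolution, at both voxel sizes, would show whether the smaller voxel size changed the coordinate range or the number of occupied voxels in a way that the larger one did not.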
I tried to make this reproducible by saving the input that's causing the crash. To do this, I used
But when I load this in another terminal, send it to the GPU, and create an instance of the model above, it doesn't crash there. However, it does crash in the original terminal (from which the data was saved).
So my questions here are:
1) Any suggestions for how to make this reproducible? I'd like to be able to give you all a proper repro that you could use to help debug.
2) Any ideas on how to fix this issue (without a repro)?
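One general point that may help with both questions: CUDA kernels are launched asynchronously, so an illegal memory access is often reported at a later call than the one that caused it. Setting the CUDA_LAUNCH_BLOCKING environment variable to 1 before CUDA is initialized makes launches synchronous, which usually gives a more accurate stack trace. A minimal sketch follows; the helper name is hypothetical, and the comment about import order assumes the failing script imports torch and torchsparse itself.

```python
import os


def enable_blocking_cuda_launches(env=None):
    """Set CUDA_LAUNCH_BLOCKING=1 in the given mapping (os.environ by default)."""
    env = os.environ if env is None else env
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


enable_blocking_cuda_launches()
# Import torch and torchsparse only after this point,
# then run the forward pass that triggers the error.
```

The same effect can be had by exporting the variable in the shell before starting Python. Synchronous launches are slower, so this is meant for debugging runs only.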
Expected Behavior
I don't expect this to crash.
Environment
Anything else?
I will gladly upload repro data if you can help me figure out how to save the data with the right info!
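Along with the saved tensors, a record of the environment in which the crash occurs may be useful, since the failure shows up in one session and not in another. The sketch below uses only the Python standard library; the function name collect_env is hypothetical, and the package names passed to it are assumed to match the installed distribution names.

```python
import json
import platform
from importlib import metadata


def collect_env(packages=("torch", "torchsparse")):
    """Collect Python, platform, and package version info as a plain dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Recorded as None so a missing package is visible in the report.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


print(json.dumps(collect_env(), indent=2))
```

Running this in both the crashing session and the non-crashing one, and comparing the two outputs, would rule out a difference in interpreter or package versions between them.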