Mixed precision (16-bit) training error

NVIDIA / MinkowskiEngine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

https://nvidia.github.io/MinkowskiEngine


Mixed precision (16-bit) training error #392

Open asadabbas09 opened 3 years ago

asadabbas09 commented 3 years ago

I'm trying to use 16-bit precision in PyTorch Lightning to save some GPU memory, but I'm getting this error:

RuntimeError:MinkowskiEngine/src/convolution_gpu.cu:69, assertion (in_feat.scalar_type() == kernel.scalar_type()) failed. type mismatch

Is there a way to fix this error?

jh-chung1 commented 2 years ago

same problem..

Ltwicke commented 2 years ago

I also get that error, and I don't understand how I'm supposed to influence the data type of the kernel, as I'm using standard ME convs.

EDIT: I fixed the problem by adjusting dtypes in the creation of my sparse tensors. dtype=torch.int16 did it for me

houyongkuo commented 2 years ago

same problem

houyongkuo commented 2 years ago

I carefully compared my input with the inputs of FCGF and Minkowskiengine/example/reconstruction.py & completion.py, and found that the dtype of my input is different from theirs: mine is float64, while the others are float32. I guess this is what causes the error.
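For readers who want to check this on their own data, here is a minimal sketch of casting features to float32 before building the sparse tensor. It uses plain PyTorch only; the helper name `as_float32_features` is hypothetical (not part of MinkowskiEngine), and the random float64 tensor merely stands in for features loaded as float64, which is the default float dtype of NumPy arrays.

```python
import torch


def as_float32_features(feats: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: return floating-point features as float32."""
    if not feats.is_floating_point():
        raise TypeError(f"expected floating-point features, got {feats.dtype}")
    # .to() returns the same tensor when the dtype already matches.
    return feats.to(torch.float32)


# Stand-in for features that were loaded as float64.
feats64 = torch.rand(4, 3, dtype=torch.float64)
feats32 = as_float32_features(feats64)

print(feats64.dtype, "->", feats32.dtype)
```

The float32 result is what you would then pass as the features when constructing the sparse tensor, so that it matches the float32 kernel weights of the convolution.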

luoao-kddi commented 11 months ago

Hi all, I ran into this problem too. I found that it occurs after removing nn.Sequential from my code.

Works well: self.module = nn.Sequential(ME.MinkowskiConvolution(3, 128, 3, dimension=3))

Raises the error: self.module = ME.MinkowskiConvolution(3, 128, 3, dimension=3)

ZiliangMiao commented 6 months ago

Same problem. You have to make sure the tensor type of the features is torch.float32.
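Under automatic mixed precision, activations may stop being float32 partway through a network even when the input features are float32. The sketch below illustrates this with plain PyTorch, not MinkowskiEngine: an `nn.Linear` layer run under `torch.autocast` returns reduced-precision output while its weights stay float32. It is only an assumption that this is what triggers the assertion in this issue. CPU autocast with bfloat16 is used so the sketch runs without a GPU.

```python
import torch
import torch.nn as nn

linear = nn.Linear(3, 8)   # parameters are stored as float32
x = torch.rand(4, 3)       # float32 input

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    # Autocast runs the linear layer in reduced precision.
    y = linear(x)
    mismatch = y.dtype != linear.weight.dtype

    # Locally disable autocast and cast back to float32 before an op
    # that requires its input and weights to share a dtype.
    with torch.autocast(device_type="cpu", enabled=False):
        y32 = y.float()

print(y.dtype, linear.weight.dtype, y32.dtype)
```

Whether wrapping the MinkowskiEngine layers in such a disabled-autocast region resolves the error here is untested; it trades the memory savings of those layers for dtype consistency.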

Tortoise0Knight commented 3 months ago

Same problem. Does that mean Mink doesn't support fp16 precision?