Describe the bug
Constructing a sparse tensor with 0 features on the GPU causes the first subsequent torch tensor creation on the GPU to fail. The same creation succeeds on the second attempt. This behavior is only observed on the GPU and for sparse tensors with 0 elements.
To Reproduce
import pytest
import torch
import MinkowskiEngine as ME


def test_weird_gpu_behavior():
    """
    Reproduces a weird behavior observed when using the gpu.
    :return:
    """
    # constants
    use_gpu = True  # fails only for gpu
    if use_gpu:
        device = "cuda"
        coordinate_map_type = ME.CoordinateMapType.CUDA
        allocator_type = ME.GPUMemoryAllocatorType.CUDA
        ME.set_gpu_allocator(allocator_type)
    else:
        device = "cpu"
        coordinate_map_type = ME.CoordinateMapType.CPU
        allocator_type = None

    # setup minkowski engine
    ME.set_sparse_tensor_operation_mode(ME.SparseTensorOperationMode.SHARE_COORDINATE_MANAGER)
    minkowski_algorithm = ME.MinkowskiAlgorithm.SPEED_OPTIMIZED
    num_threads = 1
    coordinate_manager = ME.CoordinateManager(D=3,
                                              num_threads=num_threads,
                                              coordinate_map_type=coordinate_map_type,
                                              minkowski_algorithm=minkowski_algorithm,
                                              allocator_type=allocator_type,
                                              )

    # some features and coordinates
    # needs to be size 0 to reproduce.
    num_points = 0
    coords = torch.zeros(size=[num_points, 3],
                         dtype=torch.int32,
                         device=device)
    features = torch.zeros(size=[num_points, 3],
                           dtype=torch.float32,
                           device=device)

    # add batch dimension
    coords, feats = ME.utils.sparse_collate([coords], [features])

    # works
    some_tensor = torch.randn(size=[1, 4], dtype=torch.float32, device=device)

    # creating a sparse tensor without any features breaks subsequent tensor constructions on the gpu
    voxel_tensor = ME.SparseTensor(features=feats, coordinates=coords,
                                   quantization_mode=ME.SparseTensorQuantizationMode.RANDOM_SUBSAMPLE,
                                   coordinate_manager=coordinate_manager,
                                   allocator_type=allocator_type)

    with pytest.raises(RuntimeError):
        # does not work, reason unknown
        some_tensor = torch.randn(size=[1, 4], dtype=torch.float32, device=device)

    # works again
    some_tensor = torch.randn(size=[1, 4], dtype=torch.float32, device=device)
Expected behavior
I would expect the above code not to raise an exception when creating some_tensor.
Desktop (please complete the following information):
OS: Ubuntu 20.04
Python version: 3.8.10
Pytorch version: 1.12.0+cu116
CUDA version: 11.6
NVIDIA Driver version: 510.85.02
Minkowski Engine version: 0.5.4
Output of the following command. (If you installed the latest MinkowskiEngine, paste the output of python -c "import MinkowskiEngine as ME; ME.print_diagnostics()". Otherwise, paste the output of the following command.)
(minkowski_gpu) magnus@magnus-ThinkPad-P1-Gen-4i:~$ python -c "import MinkowskiEngine as ME; ME.print_diagnostics()"
==========System==========
Linux-5.15.0-48-generic-x86_64-with-glibc2.29
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS"
3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0]
==========Pytorch==========
1.12.0+cu116
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 510.85.02
CUDA Version 11.6
VBIOS Version 94.04.51.00.53
Image Version G001.0000.03.03
GSP Firmware Version N/A
==========NVCC==========
sh: 1: nvcc: not found
==========CC==========
/usr/bin/c++
c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
==========MinkowskiEngine==========
0.5.4
MinkowskiEngine compiled with CUDA Support: True
NVCC version MinkowskiEngine is compiled: 11060
CUDART version MinkowskiEngine is compiled: 11060
(minkowski_gpu) magnus@magnus-ThinkPad-P1-Gen-4i:~$ wget -q https://raw.githubusercontent.com/NVIDIA/MinkowskiEngine/master/MinkowskiEngine/diagnostics.py ; python diagnostics.py
==========System==========
Linux-5.15.0-48-generic-x86_64-with-glibc2.29
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS"
3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0]
==========Pytorch==========
1.12.0+cu116
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 510.85.02
CUDA Version 11.6
VBIOS Version 94.04.51.00.53
Image Version G001.0000.03.03
GSP Firmware Version N/A
==========NVCC==========
sh: 1: nvcc: not found
==========CC==========
/usr/bin/c++
c++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
==========MinkowskiEngine==========
0.5.4
MinkowskiEngine compiled with CUDA Support: True
NVCC version MinkowskiEngine is compiled: 11060
CUDART version MinkowskiEngine is compiled: 11060
Additional context
Add any other context about the problem here.
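As a stopgap, and assuming the failure really is limited to the first call after the empty sparse tensor is constructed (which is all I have observed), a small retry wrapper can mask it. This is only a sketch: `call_with_one_retry` is a hypothetical helper of my own, not part of MinkowskiEngine or PyTorch, and it hides the symptom rather than fixing the underlying cause.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_one_retry(fn: Callable[[], T], exc_type: type = RuntimeError) -> T:
    """Call fn(); if it raises exc_type, call it exactly once more.

    Hypothetical workaround helper. It assumes the failure is transient and
    affects only the first attempt, as in the reproduction above.
    """
    try:
        return fn()
    except exc_type:
        # Second attempt; any exception raised here propagates to the caller.
        return fn()
```

In the reproduction above it would be used as `some_tensor = call_with_one_retry(lambda: torch.randn(size=[1, 4], dtype=torch.float32, device=device))`. Skipping the `ME.SparseTensor` construction entirely when `num_points == 0` may be a cleaner option where the calling code allows it.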