isl-org / Open3D-ML

An extension of Open3D to address 3D Machine Learning tasks

ContinuousConv on CUDA returns 0.0 (Open3D 0.18) #646

Open iSach opened 6 months ago

iSach commented 6 months ago

Describe the issue

On 0.17, the CPU and GPU implementations of FixedRadiusSearch return the same result, which makes CConv work as intended.

On 0.18, the CPU implementation still works, but the GPU implementation returns empty results. This makes CConv return 0.0 when using CUDA.

Steps to reproduce the bug

import torch
import open3d.ml.torch as t3d

torch.set_default_device('cuda')

inp_positions = torch.randn([20,3])
inp_features = torch.randn([20,4])
out_positions = torch.randn([10,3])

conv = t3d.layers.ContinuousConv(
    in_channels=4,
    filters=4,
    kernel_size=[2,2,2],
)

res = conv(inp_features, inp_positions, out_positions, extents=2.0)

res_cuda = conv.cuda()(inp_features.cuda(), inp_positions.cuda(), out_positions.cuda(), extents=2.0)

# Collect the value range of both outputs so they can be compared
resmin, resmax = res.min().item(), res.max().item()
rescmin, rescmax = res_cuda.min().item(), res_cuda.max().item()

print(f"{resmin=}\n{resmax=}\n{rescmin=}\n{rescmax=}")

Error message

resmin=<different from 0>
resmax=<different from 0>
rescmin=0.0
rescmax=0.0

Expected behavior

As with Open3D 0.17, FixedRadiusSearch should work properly on CUDA.

Open3D, Python and System information

- Operating system: Ubuntu 22.04
- Python version: 3.10
- Open3D version: 0.18
- System type: x86
- Is this remote workstation?: yes
- How did you install Open3D?: pip (clean conda environment, only Torch & Open3D)

Additional information

No response

iSach commented 6 months ago

More precisely, the issue seems to come from build_spatial_hash_table.

Given the following point cloud and radius:

import torch
import open3d.ml.torch as ml3d

points = torch.Tensor([
  [0.1,0.1,0.1],
  [0.5,0.5,0.5],
  [1.7,1.7,1.7],
  [1.8,1.8,1.8],
  [0.3,2.4,1.4]])

radius = 1.0
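As a reference for what any correct fixed radius search should find on these five points, here is a minimal brute-force sketch in plain NumPy. It does not use Open3D and says nothing about how the hash table is built internally; the helper name brute_force_radius_neighbors is made up for illustration, and it assumes the search counts a point as its own neighbor.

```python
import numpy as np


def brute_force_radius_neighbors(points: np.ndarray, radius: float) -> list[list[int]]:
    """Hypothetical helper: for each point, list the indices of all points
    within `radius` of it (the point itself included)."""
    # Pairwise difference vectors, shape (N, N, 3)
    diffs = points[:, None, :] - points[None, :, :]
    # Pairwise Euclidean distances, shape (N, N)
    dists = np.linalg.norm(diffs, axis=-1)
    return [np.flatnonzero(row <= radius).tolist() for row in dists]


points = np.array([
    [0.1, 0.1, 0.1],
    [0.5, 0.5, 0.5],
    [1.7, 1.7, 1.7],
    [1.8, 1.8, 1.8],
    [0.3, 2.4, 1.4],
])
radius = 1.0

neighbors = brute_force_radius_neighbors(points, radius)
for i, idx in enumerate(neighbors):
    print(i, idx)
```

Under that assumption every point has at least itself as a neighbor at this radius, so an empty search result cannot be correct for this input.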

On CPU, build_spatial_hash_table returns:

table = ml3d.ops.build_spatial_hash_table(points,
                                          radius,
                                          points_row_splits=torch.LongTensor([0,5]),
                                          hash_table_size_factor=1/64)

build_spatial_hash_table(hash_table_index=tensor([0, 1, 2, 3, 4], dtype=torch.int32), hash_table_cell_splits=tensor([0, 5], dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32))

and on CUDA:

table = ml3d.ops.build_spatial_hash_table(points.cuda(),
                                          radius,
                                          points_row_splits=torch.LongTensor([0,5]),
                                          hash_table_size_factor=1/64)

build_spatial_hash_table(hash_table_index=tensor([0, 0, 0, 0, 0], device='cuda:0', dtype=torch.int32), hash_table_cell_splits=tensor([0, 0], device='cuda:0', dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32))

The CUDA call sometimes even returns values like build_spatial_hash_table(hash_table_index=tensor([1065353216, 1065353216, 1065353216, 1056964608, 1073741824], device='cuda:0', dtype=torch.int32), hash_table_cell_splits=tensor([0, 0], device='cuda:0', dtype=torch.int32), hash_table_splits=tensor([0, 1], dtype=torch.int32)), which obviously causes overflow issues if used later in the (fixed) radius search for ContinuousConv.
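A side observation, offered as a guess rather than a diagnosis: the large integers in hash_table_index are exactly the IEEE 754 bit patterns of small float32 values. That would be consistent with the index buffer holding uninitialized or float data instead of indices. The sketch below only uses the Python standard library, and the helper name int32_bits_as_float is made up for illustration.

```python
import struct


def int32_bits_as_float(value: int) -> float:
    """Hypothetical helper: reinterpret the 32 bits of an int32 as a float32."""
    return struct.unpack("<f", struct.pack("<i", value))[0]


# Values copied from the hash_table_index tensor reported on CUDA
garbage = [1065353216, 1065353216, 1065353216, 1056964608, 1073741824]
decoded = [int32_bits_as_float(v) for v in garbage]
print(decoded)  # [1.0, 1.0, 1.0, 0.5, 2.0]
```

None of these are plausible point indices for a 5-point cloud, so any later lookup that trusts them would read far out of bounds.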