facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

Pulsar Batch rendering get stuck #1600

Open · opened 1 year ago by jiaming-ai

jiaming-ai commented 1 year ago

🐛 Bugs / Unexpected behaviors

When calling the pulsar renderer (pytorch3d.renderer.points.pulsar.renderer) directly with batched inputs and a batch size > 1, the call hangs. (With batch size 1 it runs successfully.)


Instructions To Reproduce the Issue:


  1. Any changes you made (git diff) or code you wrote. Below is the minimal code to reproduce the bug:
from pytorch3d.renderer.points.pulsar.renderer import Renderer as PulsarRenderer
import torch
device = torch.device("cuda:3")
render = PulsarRenderer(
    width=480,
    height=640,
    max_num_balls=1000000,
    orthogonal_projection=False,
    right_handed_system=False,
    n_channels=1,
).to(device)
vert_pos = torch.tensor([[[-0.6250,  0.2250,  0.8250],
                        [-0.6250,  0.2750,  0.8250],
                        [-0.5750,  0.2250,  0.8250]],
                        ]).to(device) # [1, 3, 3]
vert_feat = torch.tensor([[[1.],
                        [1.],
                        [1.]]]
                       ).to(device) # [1, 3, 1]
vert_rad = torch.tensor([[0.0329, 0.0322, 0.0324],
                        ]).to(device) # [1, 3]
cam_params = torch.tensor([[ 0.,  1.0, -2.9802e-08, -1.0,  5.5507e-17,
                        0, -0.0000e+00, -8.6604e-01, -4.9997e-01,  1.0-1e-6,
                        7.6773e-01,  0.0000e+00,  0.0000e+00],
                        ],).to(device) # [1, 13]
gamma = 1e-3
zfar = 100.0
znear = 1.0

# Case 1: let's try batch size = 1
ret1 = render(
    vert_pos=vert_pos,
    vert_col=vert_feat,
    vert_rad=vert_rad,
    cam_params=cam_params,
    gamma=gamma,
    max_depth=zfar,
    min_depth=znear,
)
# ret1 can be successfully rendered!

# Case 2: let's try batch size = 2
vert_pos = vert_pos.expand(2, -1, -1) # [2, 3, 3]
vert_feat = vert_feat.expand(2, -1, -1) # [2, 3, 1]
vert_rad = vert_rad.expand(2, -1) # [2, 3]
cam_params = cam_params.expand(2, -1) # [2, 13]
ret2 = render(
    vert_pos=vert_pos,
    vert_col=vert_feat,
    vert_rad=vert_rad,
    cam_params=cam_params,
    gamma=gamma,
    max_depth=zfar,
    min_depth=znear,
)
# the process gets stuck here, while CPU utilization for this Python process stays at 100%
  2. The exact command(s) you ran:

See above code.

  3. What you observed (including the full logs):

Case 1 renders successfully, but case 2 gets stuck (I terminated it after half an hour). See the code above.

(no logs are printed)


Additional notes: I noticed that the unified wrapper (pytorch3d/renderer/points/pulsar/unified.py) renders sequentially with a for loop instead of calling the native code directly with batched inputs. However, I checked the C++ code and found that it uses non-blocking calls to process batch data, so passing batched inputs should be faster. Is there a specific reason the unified wrapper does not use batched inputs?
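For reference, the two call patterns being compared look roughly like this. This is a simplified sketch of my understanding, not the actual unified.py code; it reuses the tensors and the `render` instance from the repro above:

# (a) Sequential pattern: render each batch element on its own and concatenate.
images_seq = torch.cat([
    render(
        vert_pos=vert_pos[i : i + 1],
        vert_col=vert_feat[i : i + 1],
        vert_rad=vert_rad[i : i + 1],
        cam_params=cam_params[i : i + 1],
        gamma=gamma,
        max_depth=zfar,
        min_depth=znear,
    )
    for i in range(vert_pos.shape[0])
], dim=0)

# (b) Batched pattern: a single call with the full [B, ...] tensors
# (this is the call that hangs for B > 1 in the repro above).
images_batched = render(
    vert_pos=vert_pos,
    vert_col=vert_feat,
    vert_rad=vert_rad,
    cam_params=cam_params,
    gamma=gamma,
    max_depth=zfar,
    min_depth=znear,
)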

bottler commented 1 year ago

What happens with a much smaller value of max_num_balls? I wonder if some resource is running out.
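For example, something like the following could be used to test that, reusing the tensors from the snippet above (10000 is just an arbitrary small value for the experiment):

# Same batch-size-2 call, but with a much smaller ball capacity, to check
# whether some capacity-dependent resource allocation is the problem.
small_render = PulsarRenderer(
    width=480,
    height=640,
    max_num_balls=10000,
    orthogonal_projection=False,
    right_handed_system=False,
    n_channels=1,
).to(device)

ret2_small = small_render(
    vert_pos=vert_pos,        # [2, 3, 3]
    vert_col=vert_feat,       # [2, 3, 1]
    vert_rad=vert_rad,        # [2, 3]
    cam_params=cam_params,    # [2, 13]
    gamma=gamma,
    max_depth=zfar,
    min_depth=znear,
)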