Calling the pulsar renderer (pytorch3d.renderer.points.pulsar.renderer) directly with batched inputs hangs when the batch size is greater than 1. With a batch size of 1 it runs successfully.
NOTE: Please look at the existing list of Issues tagged with the label 'bug'. Only open a new issue if this bug has not already been reported. If an issue already exists, please comment there instead.
Instructions To Reproduce the Issue:
Please include the following (depending on what the issue is):
Any changes you made (git diff) or code you wrote
Below is the minimum code to reproduce the bug:
Case 1 renders successfully, but case 2 gets stuck (I terminated it after half an hour).
See above code.
(no logs are printed)
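Since the hang prints nothing and otherwise has to be killed by hand, one way to make the two cases easier to compare is to run each in a child process with a timeout. The following is only a minimal sketch using the Python standard library: `run_with_timeout` is a hypothetical helper name, and the two snippets are placeholders standing in for the real batch-size-1 and batch-size-2 renderer calls, which are not reproduced here.

```python
import subprocess
import sys


def run_with_timeout(snippet: str, timeout_s: float) -> str:
    """Run a Python snippet in a child process and report 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising on timeout.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


# Placeholder snippets: swap in the actual case 1 and case 2 code.
print(run_with_timeout("print('done')", timeout_s=30))
print(run_with_timeout("import time; time.sleep(60)", timeout_s=1))
```

With the real reproduction code substituted, a "timeout" result for case 2 and "ok" for case 1 would confirm the hang without waiting half an hour.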
Please also simplify the steps as much as possible so they do not require additional resources to run, such as a private dataset.
==
Additional notes:
I noticed that the unified wrapper (pytorch3d/renderer/points/pulsar/unified.py) renders sequentially with a for loop instead of calling the c_native code directly with batched inputs. However, I checked the C++ code and found that it uses non-blocking calls to process batched data, so batched inputs should be faster. Is there a specific reason the unified wrapper doesn't use batched inputs?
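To make the two call patterns concrete, here is a rough sketch in plain Python. It does not use the pulsar API: `render_sequentially`, `render_as_batch`, `fake_render_one` and `fake_render_many` are all hypothetical stand-ins, and "rendering" a scene here just sums its values. The point is only the shape of the comparison, one call per batch element versus a single call for the whole batch, and the check that both give the same result.

```python
from typing import Callable, List, Sequence


def render_sequentially(
    render_one: Callable[[Sequence[float]], float],
    batch: Sequence[Sequence[float]],
) -> List[float]:
    """One renderer call per batch element (the for-loop pattern)."""
    return [render_one(scene) for scene in batch]


def render_as_batch(
    render_many: Callable[[Sequence[Sequence[float]]], Sequence[float]],
    batch: Sequence[Sequence[float]],
) -> List[float]:
    """A single renderer call for the whole batch."""
    return list(render_many(batch))


# Hypothetical stand-ins for a renderer: each "image" is the sum of a scene.
def fake_render_one(scene: Sequence[float]) -> float:
    return sum(scene)


def fake_render_many(scenes: Sequence[Sequence[float]]) -> List[float]:
    return [sum(scene) for scene in scenes]


scenes = [[1.0, 2.0], [3.0, 4.0, 5.0]]
sequential = render_sequentially(fake_render_one, scenes)
batched = render_as_batch(fake_render_many, scenes)
assert sequential == batched
```

The same equivalence check could be applied to the real renderer once the batched call no longer hangs, to confirm that switching the wrapper to batched inputs would not change its output.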