Open NJUSTzwh opened 1 year ago
Your output confuses me: your NX is even slower than my Jetson Nano (4GB), which should not be the case. The output on my Jetson Nano is as follows:
./demo
GPU has cuda devices: 1
----device id: 0 info----
GPU : NVIDIA Tegra X1
Capbility: 5.3
Global memory: 3956MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
------------checking CUDA ----------------
CUDA Loaded 119978 data points from PCD file with the following fields: x y z
------------checking CUDA PassThrough ----------------
CUDA PassThrough by Time: 1.9844 ms.
CUDA PassThrough before filtering: 119978
CUDA PassThrough after filtering: 5110
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 35.325 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440
------------checking PCL ----------------
PCL(CPU) Loaded 119978 data points from PCD file with the following fields: x y z
------------checking PCL(CPU) PassThrough ----------------
PCL(CPU) PassThrough by Time: 9.47348 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 5110 data points (x y z).
------------checking PCL VoxelGrid----------------
PCL VoxelGrid by Time: 24.2884 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 3440 data points (x y z).
After I run jetson_clocks, it gets faster. The output is as follows:
./demo
GPU has cuda devices: 1
----device id: 0 info----
GPU : NVIDIA Tegra X1
Capbility: 5.3
Global memory: 3956MB
Const memory: 64KB
SM in a block: 48KB
warp size: 32
threads in a block: 1024
block dim: (1024,1024,64)
grid dim: (2147483647,65535,65535)
------------checking CUDA ----------------
CUDA Loaded 119978 data points from PCD file with the following fields: x y z
------------checking CUDA PassThrough ----------------
CUDA PassThrough by Time: 1.39955 ms.
CUDA PassThrough before filtering: 119978
CUDA PassThrough after filtering: 5110
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 11.9661 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440
------------checking PCL ----------------
PCL(CPU) Loaded 119978 data points from PCD file with the following fields: x y z
------------checking PCL(CPU) PassThrough ----------------
PCL(CPU) PassThrough by Time: 3.32619 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 5110 data points (x y z).
------------checking PCL VoxelGrid----------------
PCL VoxelGrid by Time: 16.5497 ms.
PointCloud before filtering: 119978 data points (x y z).
PointCloud after filtering: 3440 data points (x y z).
Finally, I don't know why cuda-pcl beats PCL in PassThrough but does not do as well in VoxelGrid. Maybe that is why PCL removed CUDA support for VoxelGrid in pcl-1.13.1.
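To put numbers on that comparison, here is a minimal sketch that turns a pair of timings from the logs above into a speedup ratio. The helper name `speedup` is made up for illustration and is not part of cuPCL or PCL.

```cpp
#include <cassert>

// Hypothetical helper (not part of cuPCL or PCL).
// Speedup of the CUDA filter relative to the PCL (CPU) filter.
// A value above 1 means CUDA was faster; below 1 means it was slower.
double speedup(double pclMs, double cudaMs) {
    assert(pclMs > 0.0 && cudaMs > 0.0);  // timings must be positive
    return pclMs / cudaMs;
}
```

With the timings from the first run, `speedup(9.47348, 1.9844)` is about 4.77 for PassThrough and `speedup(24.2884, 35.325)` is about 0.69 for VoxelGrid. With the timings after jetson_clocks, the ratios are about 2.38 and 1.38, so in these two runs CUDA VoxelGrid is slower than PCL only at the default clocks.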
@MagicalBrain Hello, I would like to ask for advice. Running machine environment:
When I use the official cuFilter demo, the cuda calculation time is basically the same as the official one. As follows:
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 3.20768 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 3440
But when I set setP.voxelX, setP.voxelY, and setP.voxelZ to 0.09, the CUDA calculation becomes far slower than I expected. As follows:
------------checking CUDA VoxelGrid----------------
CUDA VoxelGrid by Time: 3109.65 ms.
CUDA VoxelGrid before filtering: 119978
CUDA VoxelGrid after filtering: 62844
Why is this? Is there any way to solve it? In most cases, setP.voxelX, setP.voxelY, and setP.voxelZ cannot simply be left at 1. I hope someone can help.
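One thing that might be worth checking, purely as an assumption since I have not verified how cuPCL implements VoxelGrid: the number of cells in a dense grid over the cloud's bounding box grows with the cube of 1/leaf size. The sketch below only counts cells. The function name `denseVoxelCount` and the 10 x 10 x 10 box are hypothetical and do not come from the demo.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of cuPCL or PCL).
// Number of cells in a dense voxel grid covering an axis-aligned box.
// The extents are the box side lengths, in the same unit as the leaf size.
std::uint64_t denseVoxelCount(double extentX, double extentY, double extentZ,
                              double leaf) {
    assert(leaf > 0.0);  // a zero or negative leaf size is meaningless
    auto cells = [leaf](double extent) {
        // Round up so a partially covered cell at the edge still counts.
        return static_cast<std::uint64_t>(std::ceil(extent / leaf));
    };
    return cells(extentX) * cells(extentY) * cells(extentZ);
}
```

For an assumed 10 x 10 x 10 box, a leaf size of 1 gives 1000 cells, while a leaf size of 0.09 gives 112 x 112 x 112 = 1404928 cells. If any step of the filter scales with the number of cells rather than the number of points, a jump like that would hurt, but whether this accounts for the 3109.65 ms measured above is not established here.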