Hi,
thank you for your interest in our work!
In the demo code, we used N=6000 to make the importance of each point even more prominent. However, the same gain in information about the important points can be obtained with significantly fewer iterations.
To determine the mean voxel density for a new dataset, the following steps should be performed on several hundred LiDAR frames (this should be sufficient for the fitted function to converge):

1. Voxelize each point cloud, e.g. with spconv.
2. Compute the distance of each occupied voxel to the sensor, e.g. with scipy.spatial.distance.cdist.
3. Finally, the distance range to cover is divided into 0.5 m steps and the mean voxel density is calculated for each bin. These data points are then used to fit a function of the form f(distance) = 1 / (a * distance^2 + b * distance + c) using scipy.optimize.curve_fit.

The sampling probability, i.e. the probability that a voxel is not masked, is then computed as lambda * (1 / f(distance)).
The specific value of lambda has been empirically determined to ensure that the average similarity score is about 0.3, as shown in Figure 8 of the paper.
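As a rough sketch of the fitting and of the sampling-probability computation (not the exact script we used; bin_centers, mean_density, and lam below are placeholders for the real binned measurements and the empirically tuned lambda):

```python
import numpy as np
from scipy.optimize import curve_fit

def f(distance, a, b, c):
    # mean voxel density as a function of distance to the sensor
    return 1.0 / (a * distance ** 2 + b * distance + c)

# bin_centers: centers of the 0.5 m distance bins
# mean_density: mean voxel density measured per bin, averaged over a few hundred frames
bin_centers = np.arange(0.25, 80.0, 0.5)          # placeholder bins
mean_density = f(bin_centers, 5e-4, 1e-2, 1.0)    # placeholder measurements

(a, b, c), _ = curve_fit(f, bin_centers, mean_density, p0=(1e-3, 1e-2, 1.0))

lam = 0.05  # placeholder; tuned so that the average similarity score ends up around 0.3
sampling_prob = lam * (1.0 / f(bin_centers, a, b, c))  # probability that a voxel is NOT masked
```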
In total, for a new dataset, the function with which the voxel density decreases with distance must be approximated (since it depends on the LiDAR sensor characteristics), and the parameter lambda (depending on the detector) must be set such that the detections begin to degrade due to the masking.
I hope this helps.
With best regards David
Hi,
I would like to sincerely thank you for your detailed and patient answer!
I think it would be really helpful for me to reproduce the determination of the mean voxel density step by step on KITTI. If the result matches yours, I will then do it again on a new dataset. Unfortunately, I am stuck on the first step.
I take as input a point cloud "000001.bin" from KITTI and use the function PointToVoxel from spconv to voxelize it:
```python
import numpy as np
import torch
from spconv.pytorch.utils import PointToVoxel

# read point cloud and drop the 4th column since intensity is not necessary for voxelization
source_file_path = 'demo_data/000001.bin'
if source_file_path.split('.')[-1] == 'bin':
    points = np.fromfile(source_file_path, dtype=np.float32)
    points = points.reshape(-1, 4)[:, :3]
elif source_file_path.split('.')[-1] == 'npy':
    points = np.load(source_file_path)[:, :3]
else:
    raise NotImplementedError

# number of total points
print("number of points: ", points.shape[0])

# transfer numpy array to torch tensor
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
points = torch.from_numpy(points).float().to(device)

# initialize PointToVoxel
point_to_voxel = PointToVoxel(
    vsize_xyz=[0.2, 0.2, 0.2],                     # voxel size
    coors_range_xyz=[-80, -80, -10, 80, 80, 50],   # coordinate range
    num_point_features=3,                          # number of point features
    max_num_voxels=20000,                          # maximum voxels, the same as the setting of your work
    max_num_points_per_voxel=200,                  # maximum points in each voxel, the same as the setting of your work
    device=device                                  # same device as the points (GPU if available)
)

voxels, indices, num_points_per_voxel = point_to_voxel(points)
```
I have two questions about the output num_points_per_voxel:

1. The original point cloud contains about 122k points, but the sum of num_points_per_voxel is only about 71k after voxelization. If I set max_num_voxels=40000, the problem disappears, but it should be max_num_voxels=20000 according to the config file kitti_pointpillar.yaml in your work.
2. I expected most voxels to be empty, i.e. to have a zero entry in num_points_per_voxel, but my output shows that all entries in num_points_per_voxel are non-zero. It really confuses me.

For reference, this is the output I get:
```
number of points: 122109
Number of Points after Voxelization: 71198
Number of zero voxels: 0
```
Thank you for reading this far. I wish you a good day.
Best Regards, Kexuan Xia
Hi,
Yes, that is correct, the number of points before and after voxelisation must be the same, i.e. the sum of num_points_per_voxel.
A few comments on this. First, it seems that you did not crop the point cloud according to the camera frustum, as is usually done in the pcdet dataloader; therefore, the maximum number of voxels is too small in this case. Second, I use the CPU implementation of spconv, like pcdet, because I had some problems with the GPU implementation. And third, you can alternatively use the generate_voxel_with_id function (https://github.com/traveller59/spconv/blob/2.1.x/spconv/pytorch/utils.py) to check where the points are lost.
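For example, something along these lines (a sketch only; I assume pc_voxel_id contains -1 for points that were not assigned to any voxel, please verify the exact return signature in the linked file):

```python
# same generator and points as in your snippet above
voxels, coords, num_points_per_voxel, pc_voxel_id = \
    point_to_voxel.generate_voxel_with_id(points)

dropped = (pc_voxel_id == -1).sum().item()
print("points dropped during voxelization:", dropped)
```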
Regarding the second question: Yes, the majority of voxels should be empty. However, you can also use the number of all possible voxels within the radius to compute the density.
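Roughly like this (a sketch, not the exact code we used): count the occupied voxels per 0.5 m distance bin and normalize by the number of grid cells whose centers fall into that bin.

```python
import numpy as np

def mean_voxel_density_per_bin(occupied_centers, grid_centers, bin_size=0.5, max_dist=80.0):
    # occupied_centers: (N, 3) centers of the occupied voxels of one frame
    # grid_centers:     (M, 3) centers of all possible voxels of the grid
    bins = np.arange(0.0, max_dist + bin_size, bin_size)
    occ_dist = np.linalg.norm(occupied_centers[:, :2], axis=1)  # BEV distance to the sensor (assumption)
    all_dist = np.linalg.norm(grid_centers[:, :2], axis=1)
    occ_hist, _ = np.histogram(occ_dist, bins=bins)
    all_hist, _ = np.histogram(all_dist, bins=bins)
    density = occ_hist / np.maximum(all_hist, 1)                # fraction of occupied cells per bin
    bin_centers = 0.5 * (bins[:-1] + bins[1:])
    return bin_centers, density
```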
Best regards David
Hi,
Thank you so much for your thorough and considerate response!
I did not crop the point cloud; I will do that to see if it is the only cause of the problem. Could you please tell me why 42 degrees is used to crop the point cloud? Is it because the same angle was used for capturing the images in KITTI?
I wish you a wonderful week.
Best regards Kexuan Xia
Hi,
The 42 degrees is just an approximation to make the demo work without the corresponding KITTI info files.
However, this is not a prerequisite for determining the voxel density; you just need to increase the number of maximum voxels in the voxel generator, as you already noted.
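Such an approximate frustum crop can look roughly like this (a sketch only, assuming the 42 degrees is measured to either side of the forward x-axis; the demo code may differ in detail):

```python
import numpy as np

def crop_to_fov(points, half_fov_deg=42.0):
    # keep points whose azimuth w.r.t. the forward (x) axis lies within +/- half_fov_deg
    azimuth = np.arctan2(points[:, 1], points[:, 0])
    return points[np.abs(azimuth) < np.deg2rad(half_fov_deg)]
```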
Best regards David
Hi,
First, I would like to express my appreciation for your impressive work. I have been exploring your paper and the associated code and have a few questions that I hope you could help clarify.
In the paper, you mentioned that setting N=3000 iterations strikes a balance between runtime and the quality of the attribution maps. However, I noticed that the demo is set at 6000 iterations. Could you help me understand the reason for this difference?
Additionally, I am keen on adapting your approach for the nuScenes dataset. As a preliminary step, I'm considering adjusting the parameters a, b, c, and lambda in the OccAM model configuration. I've seen that there's already a discussion on this topic under the thread "Questions about mean voxel density #3". In your response, you mentioned using a few hundred samples to determine the mean voxel density. Could you share more about how this was achieved? Did you use any open-source libraries, or was it entirely custom code?
Besides modifying the values of a, b, c, and lambda, are there other adjustments that are essential for successfully applying this method to the nuScenes dataset?
Thank you for your time and assistance. I look forward to your insights.
Best regards, Kexuan Xia