The transform method of the RandLANet class currently copies the input data before sampling points, so the full (N, 3) pointcloud is duplicated even though only num_points points are used afterwards. This significantly impacts performance, especially for large pointcloud files containing millions of points.
Move the copy() operations to after the point sampling, so that only the sampled points are copied. The point sampler doesn't perform any in-place operations on the pc array, so this shouldn't have any side effects on the original pointcloud.
pc = data["point"]  # full pointcloud (N, 3)
label = data["label"]
feat = data["feat"] if data["feat"] is not None else None
tree = data["search_tree"]

selected_idxs, center_point = self.trans_point_sampler(
    pc=pc,
    feat=feat,
    label=label,
    search_tree=tree,
    num_points=self.cfg.num_points,
    sampler=self.cfg.get("sampler", None),
)  # Points are sampled from the whole pointcloud (n_points, 3)

# Copy only the sampled subset instead of the full arrays
pc_sub = pc[selected_idxs]
pc = pc_sub.copy()
label_sub = label[selected_idxs]
label = label_sub.copy()
if feat is not None:
    feat_sub = feat[selected_idxs]
    feat = feat_sub.copy()
random.shuffle(idxs)
return idxs, center_point
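The claim that sampling leaves the original pointcloud untouched can be checked with a small standalone sketch. It assumes the data are NumPy arrays; sample_indices is a hypothetical stand-in for the real sampler (it only reads pc and returns indices), and the array sizes are arbitrary. In NumPy, indexing with an integer index array returns a new array rather than a view, so later in-place edits to the subset cannot reach the source array.

```python
import numpy as np


def sample_indices(pc, num_points, rng):
    """Hypothetical stand-in for the point sampler.

    Reads pc but never writes to it, and returns only indices
    plus the chosen center point.
    """
    center_idx = rng.integers(len(pc))
    center_point = pc[center_idx].reshape(1, -1)
    # Brute-force nearest neighbours (the real code uses a search tree).
    dists = np.sum((pc - center_point) ** 2, axis=1)
    idxs = np.argsort(dists)[:num_points]
    rng.shuffle(idxs)
    return idxs, center_point


rng = np.random.default_rng(0)
pc = rng.random((10_000, 3)).astype(np.float32)
pc_before = pc.copy()  # snapshot used only for the check below

selected_idxs, center_point = sample_indices(pc, 1024, rng)

# Integer-array indexing produces a new array, not a view of pc.
pc_sub = pc[selected_idxs]

# Simulate a later in-place transform on the sampled subset.
pc_sub -= center_point

shares_memory = np.shares_memory(pc_sub, pc)
original_unchanged = np.array_equal(pc, pc_before)
print("subset shares memory with original:", shares_memory)
print("original pointcloud unchanged:", original_unchanged)
```

If the arrays were not NumPy arrays, or the sampler used slicing instead of index arrays, this guarantee would need to be rechecked.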
Additional information
Performance Improvement
I ran performance tests on both the current and proposed implementations, each for a single epoch with the configuration ml3d/configs/randlanet_toronto3d.yml.
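For a quick check outside a full training epoch, the two copy orders can be compared in isolation. This is only a rough sketch, not the benchmark described above: the function names, the pointcloud size, and the number of sampled points are all assumptions chosen for illustration, and the timings will vary by machine.

```python
import time

import numpy as np


def copy_then_sample(pc, idxs):
    """Current order: copy all N points, then select the subset."""
    pc_full = pc.copy()
    return pc_full[idxs]


def sample_then_copy(pc, idxs):
    """Proposed order: select the subset, then copy only those points."""
    return pc[idxs].copy()


def best_time(fn, *args, repeats=10):
    """Return the fastest wall-clock time over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


rng = np.random.default_rng(0)
pc = rng.random((1_000_000, 3), dtype=np.float32)  # assumed pointcloud size
idxs = rng.choice(len(pc), size=65_536, replace=False)  # assumed num_points

t_current = best_time(copy_then_sample, pc, idxs)
t_proposed = best_time(sample_then_copy, pc, idxs)
print(f"copy then sample: {t_current * 1e3:.3f} ms")
print(f"sample then copy: {t_proposed * 1e3:.3f} ms")
```

Both functions return the same values, so the comparison isolates the cost of the extra full-array copy.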
Code referenced above (the current transform implementation and the point sampler):
https://github.com/isl-org/Open3D-ML/blob/473592d6abf2492f906495f7eb949e52c44fe6ae/ml3d/torch/models/randlanet.py#L170-L185
https://github.com/isl-org/Open3D-ML/blob/473592d6abf2492f906495f7eb949e52c44fe6ae/ml3d/datasets/samplers/semseg_random.py#L51-L53