Hi, great work.
I'd like to run a large-scale test, but at the moment the PENet depth completion script is a huge bottleneck: it is much slower than the other steps (roughly an order of magnitude).
Any recommendation for making this step faster while not being too painful?
Edit: I've tried using Point2VoxelGPU3d instead of Point2VoxelCPU3d (changing point_to_voxel to point_to_voxel_hash, as below). While this is a good deal faster (>2x), ~it gives quite different results (mean absolute error of about 10), so I'm not sure if I should use them. I've listed it below; not sure if I'm using it wrong?~
I forgot that the point order would be different. After sorting, the results are very similar, and performance is comparable too. Combined with vectorizing the for loop in la_sampling2, I get a pretty good speedup. Not fully real-time, but good enough for my purposes.
Unless you have a recommendation for something even better, I'd consider this solved.
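For anyone repeating the comparison, here is a minimal sketch of the sorting check in NumPy. It assumes each voxelizer output has been reduced to one feature row per voxel (shape `(N, C)`) together with integer grid coordinates (shape `(N, 3)`); the function names are hypothetical and not part of spconv or this repository.

```python
import numpy as np

def sort_voxels(features, coords):
    """Reorder voxels by their integer grid coordinates."""
    # np.lexsort uses the LAST key as the primary one, so column 0 is primary here.
    order = np.lexsort((coords[:, 2], coords[:, 1], coords[:, 0]))
    return features[order], coords[order]

def max_abs_diff(features_a, coords_a, features_b, coords_b):
    """Largest absolute difference after putting both outputs in the same order."""
    fa, ca = sort_voxels(features_a, coords_a)
    fb, cb = sort_voxels(features_b, coords_b)
    if not np.array_equal(ca, cb):
        raise ValueError("the two outputs do not cover the same voxels")
    return float(np.abs(fa - fb).max())
```

Comparing the raw arrays row by row without this step reports a large error even when both voxelizers agree, because the rows are simply in a different order.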
OK, I see. The sampling is just there to save some storage space; you can keep it or remove it. PENet can also be replaced by the faster ENet or FusionNet.
@DKuhse Hi, I am in a similar situation. The voxelization in la_sampling2 takes nearly two seconds, which is too painful. Could you share the code for the for loop you modified?
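The modified loop is not posted in this thread, so the following is only a generic sketch of the kind of vectorization mentioned above, not the actual la_sampling2 code. It assumes a per-voxel Python loop that averages the points falling in each voxel, and that every voxel contains at least one point; `voxel_ids`, `voxel_means_loop`, and `voxel_means_vectorized` are hypothetical names.

```python
import numpy as np

def voxel_means_loop(points, voxel_ids, num_voxels):
    """Baseline: one boolean mask and one mean per voxel (slow for many voxels)."""
    means = np.zeros((num_voxels, points.shape[1]))
    for v in range(num_voxels):
        means[v] = points[voxel_ids == v].mean(axis=0)
    return means

def voxel_means_vectorized(points, voxel_ids, num_voxels):
    """Same result without the Python loop."""
    sums = np.zeros((num_voxels, points.shape[1]))
    # Unbuffered scatter-add: rows of `points` accumulate into their voxel's row.
    np.add.at(sums, voxel_ids, points)
    # Number of points per voxel; assumed to be nonzero for every voxel.
    counts = np.bincount(voxel_ids, minlength=num_voxels)
    return sums / counts[:, None]
```

The loop version scans all points once per voxel, while the vectorized version makes a single pass over the points, which is where the saving comes from when the number of voxels is large.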