maradanovic closed this issue 2 years ago.
Replied through email.
Thanks!
Hi,
I was hoping you could help me again with a short question.
Through testing with the pre-trained 3DMatch network and modifying the voxel size, the scale of the kernel points/receptive field, and the first subsampling, I've found that a voxel size of 9 cm performs best when registering low-quality models from different sensors.
However, this is still with the network trained at voxel = 3 cm, and I would like to test the performance of a network trained with voxel = 9 cm on 3DMatch. Could you please tell me whether there is anything else to adjust other than:
Hi, since you have tested with different combinations of hyperparameters (e.g. voxel size, first_subsampling_dl), you could just use the best one for training (so in your case, it seems you only need to set first_subsampling_dl). BTW, I take voxel_size equal to first_subsampling_dl by default for 3DMatch, but you could use different values for them depending on your application. Generally, voxel_size controls how sparse the input point cloud is (and also removes points that are too close together), while first_subsampling_dl affects the receptive field of the descriptor.
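To make the distinction concrete, here is a minimal numpy sketch of grid/voxel downsampling, the idea behind both voxel_size and first_subsampling_dl. D3Feat's actual grid subsampling is a compiled C++ routine, so this is only an illustration of the concept, not the real implementation:

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Replace all points falling in each voxel cell by their centroid.

    A simplified stand-in for the grid subsampling behind
    first_subsampling_dl; D3Feat's real implementation is compiled C++.
    """
    coords = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(
        coords, axis=0, return_inverse=True, return_counts=True)
    sums = np.zeros((counts.size, points.shape[1]))
    np.add.at(sums, inverse, points)  # sum the points of each voxel...
    return sums / counts[:, None]     # ...then average them

rng = np.random.default_rng(0)
cloud = rng.random((10000, 3))          # dense cloud inside a 1 m cube
sparse = voxel_downsample(cloud, 0.09)  # at most 12^3 occupied voxels
print(cloud.shape[0], "->", sparse.shape[0])
```

Larger voxel sizes collapse more points into each cell, which is why a 9 cm setting makes clouds from very different sensors look alike.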
Thanks again!
Hi,
I've been having trouble running train_3DMatch with a modified first_subsampling_dl. If left as the original value (0.03), training runs smoothly, but as soon as it's modified this occurs:
```
Dataset Preparation
Preparing ply files
PKL file not found.
Preparing ply files
PKL file not found.
Initiating input pipelines
Traceback (most recent call last):
  File "training_3DMatch.py", line 175, in
```
Any idea why this is happening?
You should prepare points.pkl and keypts.pkl for your own dataset: https://github.com/XuyangBai/D3Feat/blob/master/datasets/ThreeDMatch.py#L108-L110. The error is raised because your self.anc_points is an empty list.
You can look at that file to see how to prepare these two files: points.pkl saves the point clouds, and keypts.pkl saves the pre-computed correspondences between pairs, computed using the ground-truth poses.
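To sketch the idea behind those two files (hypothetical schema — the dict keys and distance threshold below are illustrative; the exact structure D3Feat expects is defined in datasets/ThreeDMatch.py):

```python
import pickle
import numpy as np
from scipy.spatial import cKDTree

def correspondences(src, tgt, pose, dist_thresh=0.0375):
    """Pair up points that coincide after applying the gt pose to src."""
    src_aligned = src @ pose[:3, :3].T + pose[:3, 3]
    dists, idx = cKDTree(tgt).query(src_aligned)
    mask = dists < dist_thresh
    return np.stack([np.nonzero(mask)[0], idx[mask]], axis=1)

# Two toy fragments related by a known translation (the "gt pose").
rng = np.random.default_rng(0)
src = rng.random((500, 3))
pose = np.eye(4)
pose[:3, 3] = [0.5, 0.0, 0.0]
tgt = src @ pose[:3, :3].T + pose[:3, 3]

corr = correspondences(src, tgt, pose)

# points.pkl holds the raw clouds; keypts.pkl the per-pair correspondences.
# (Key names here are made up for the example.)
with open("points.pkl", "wb") as f:
    pickle.dump({"frag_0": src, "frag_1": tgt}, f)
with open("keypts.pkl", "wb") as f:
    pickle.dump({"frag_0@frag_1": corr}, f)
```

Since the toy fragments align exactly, every point finds its counterpart; on real scans only the overlapping region produces pairs.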
Makes sense, thanks! It seems I'll need 350 GB or more to download and unzip the 54 3DMatch datasets, which (of course!) I don't have on the current machine, so I'll need some time to set things up.
If you don't mind, let me keep this open for now.
I do not understand why you need the 3DMatch dataset. As you mentioned before, you are trying to register a dense and detailed TLS (Terrestrial Laser Scanner) point cloud with a mesh created from a depth camera, so why not prepare your own data and train directly on it? In that case, you will not have a domain gap between your training and test sets.
You're correct, it would probably be best to train two networks on my own data. However, as far as I know, no public dataset exists with both TLS and depth camera data of the same indoor environments, and I only have data for one indoor environment. Gathering enough data for training could take up to a month, which is time I don't have.
I'll try to explain my reasoning here - please correct me if you believe I'm wrong on any point.
The pretrained 3 cm 3DMatch network appears to generalize well when both the cloud downsampling and the receptive field are scaled to about 9-11 cm (which speaks highly of the generalization ability of D3Feat!). This probably works in my case because I'm registering large models, e.g. a model of a whole room to a model of the whole floor, so I don't really need density or detail.
I don't think there's a big difference between point clouds from different sensors once points are 10 cm apart, be it a TLS, a depth camera, or any other source of interior point clouds. The main difference comes from the patterns specific sensors leave on specific surfaces, depending on the sensor type and the platform used (stationary, mobile, etc.). So, at voxel sizes of 10 cm, point clouds of a room should look very similar no matter which sensor gathered the data.
With this in mind, I'd like to downsample 3DMatch, which appears to be the best public indoor dataset, train D3Feat on it, and see whether I get better results compared to the pretrained network with modified voxels.
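The retraining change amounts to scaling the voxel-related lengths by one common factor. A hedged sketch (first_subsampling_dl and voxel_size are the parameters named in this thread; the dict below is illustrative, not D3Feat's actual config object, so check the real config for any other length-dependent settings):

```python
# Move every voxel-related length from the 3 cm regime to 9 cm by one
# common factor. Only the two parameters named in this thread are shown;
# the dict is a stand-in for the actual D3Feat config.
scale = 0.09 / 0.03  # target voxel size / pretrained voxel size

base = {
    "voxel_size": 0.03,            # input downsampling voxel (m)
    "first_subsampling_dl": 0.03,  # first grid subsampling (m)
}
scaled = {k: v * scale for k, v in base.items()}
print(scaled)
```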
Hi, I see your point, and I agree with your opinion on the differences between point clouds captured by different sensors; choosing a proper down-sampling strategy could reduce such a domain gap to some extent. I also have some other suggestions for your case.
Thanks for taking the time for these suggestions!
Hi,
First, thanks for this amazing work and making it open source!
There is something I wanted to ask your advice on.
I would like to use D3Feat for registration of indoor models from different sensors. Specifically, I’m trying to register a dense and detailed TLS (Terrestrial Laser Scanner) point cloud with a mesh created from a depth camera (the mesh is converted to a point cloud either by taking the vertices or by randomly sampling the mesh, but in both cases these are not as detailed as TLS point clouds). I’ve had some success using the 3DMatch pretrained network, and improved the generalisation by increasing the voxel size, the scale of the kernel points/receptive field, and the first subsampling to 10 cm, but I’m trying to figure out whether there is a better approach.
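For reference, "randomly sampling the mesh" is usually done by area-weighted triangle sampling with uniform barycentric coordinates. A small numpy sketch (illustrative; not tied to any particular mesh library):

```python
import numpy as np

def sample_mesh(vertices, faces, n):
    """Draw n points uniformly from a triangle mesh's surface."""
    tri = vertices[faces]  # (F, 3, 3): the 3 corners of each face
    # Pick faces with probability proportional to their area.
    areas = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    rng = np.random.default_rng(0)
    choice = rng.choice(len(faces), size=n, p=areas / areas.sum())
    # Uniform barycentric coordinates; reflect to stay inside the triangle.
    u, v = rng.random(n), rng.random(n)
    flip = u + v > 1
    u[flip], v[flip] = 1 - u[flip], 1 - v[flip]
    t = tri[choice]
    return (t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0])
                    + v[:, None] * (t[:, 2] - t[:, 0]))

# Two triangles forming a unit square in the z = 0 plane.
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], float)
faces = np.array([[0, 1, 2], [0, 2, 3]])
pts = sample_mesh(verts, faces, 1000)
```

Unlike taking only the vertices, this gives a point density you can control independently of the mesh resolution.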
I was thinking about (i) training the network on two different sets of data at once, by combining 3DMatch (depth) and TLS data, and (ii) training two different networks on two different datasets and dealing with the domain adaptation later (with a third network), but neither looks like a good solution to me (of course, I might be wrong here).
I know this is a broad question, but how would you approach this problem?
It would be great to hear your opinion!