mihaidusmanu / d2-net

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

How to generate the dataset for retraining the network (our data includes infrared and RGB images) #60

Closed zxp771 closed 4 years ago

zxp771 commented 4 years ago

Hi, thanks for your excellent work and for sharing it. I have read the documentation you shared, but I have a problem and need your help. The documentation says the network was trained on the MegaDepth dataset; I'm wondering how to retrain the network with our own dataset? (As mentioned in the title, our data includes RGB and infrared images.) It looks like we need to use the scripts in megadepth_utils, but I can't find where to read in our own data so as to generate the same dataset format that preprocess_scene.py produces.

mihaidusmanu commented 4 years ago

Hello. We only trained our network on MegaDepth, so all the scripts in megadepth_utils, as well as the data loader lib/dataset.py, only work with the MegaDepth dataset structure.

If you wish to re-train D2-Net on a different dataset, you will need to write your own dataset interface to replace lib/dataset.py. For each image, you will need the following information: the RGB image, its depth map, and the camera pose and intrinsics. If you have all this information, all you need to do is sample image pairs with sufficient overlap and return the information above in a dictionary similar to:

https://github.com/mihaidusmanu/d2-net/blob/2a4d88fbe84961a3a17c46adb6d16a94b87020c5/lib/dataset.py#L228-L239

The rest of the code should work without any changes.
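For illustration, here is a minimal sketch of what such a dataset interface could look like. The key names follow the snippet linked above; load_example and the precomputed pairs list are placeholders for your own loading and overlap-based pair-sampling code:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class CrossModalDataset(Dataset):
    # Hypothetical replacement for lib/dataset.py. `pairs` is a precomputed
    # list of image pairs with sufficient overlap.
    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        pair = self.pairs[idx]
        # load_example is a placeholder: it should return the image as a
        # float32 array, a per-pixel depth map consistent with the poses,
        # the 3x3 intrinsics matrix, and the 4x4 world-to-camera pose.
        image1, depth1, intrinsics1, pose1 = load_example(pair['path1'])
        image2, depth2, intrinsics2, pose2 = load_example(pair['path2'])
        return {
            'image1': torch.from_numpy(image1.astype(np.float32)),
            'depth1': torch.from_numpy(depth1.astype(np.float32)),
            'intrinsics1': torch.from_numpy(intrinsics1.astype(np.float32)),
            'pose1': torch.from_numpy(pose1.astype(np.float32)),
            'image2': torch.from_numpy(image2.astype(np.float32)),
            'depth2': torch.from_numpy(depth2.astype(np.float32)),
            'intrinsics2': torch.from_numpy(intrinsics2.astype(np.float32)),
            'pose2': torch.from_numpy(pose2.astype(np.float32)),
        }
```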

zxp771 commented 4 years ago

Hi @mihaidusmanu sorry to disturb you again. I'm wondering how to get the depth information? I know you used the MegaDepth images, but they are all 2D images. Did you use COLMAP's dense reconstruction to get the depth information?

mihaidusmanu commented 4 years ago

MegaDepth already provides depth maps, computed using COLMAP MVS plus some refinement. For more details, please refer to the MegaDepth website/paper at https://research.cs.cornell.edu/megadepth/.

zxp771 commented 4 years ago

Hi @mihaidusmanu You mean we can generate the depth information with COLMAP MVS? I read their paper before. In our case, we want to reconstruct from images of different sources (infrared + RGB). We already tried feeding the images to COLMAP MVS directly; no surprise, COLMAP could not find any matches between the infrared and RGB images (I think this is because the SiftGPU matcher cannot match images from different sources). However, we found that your D2-Net can find some matches, so we tried the method from https://github.com/tsattler/visuallocalizationbenchmark/tree/master/local_feature_evaluation, which plugs your D2-Net features and descriptors into COLMAP's pipeline. So far we haven't obtained any good reconstruction results. We are therefore considering fine-tuning your pre-trained model on our multi-source images. The most difficult part is getting the depth information, and even with your pre-trained model we still can't get a good reconstruction. Could you give us some suggestions?
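For reference, this is roughly how we matched the D2-Net features between the two modalities, using mutual nearest neighbors over the descriptors (a sketch; the file names are illustrative, and the .d2-net files come from extract_features.py):

```python
import numpy as np

def mutual_nn_matches(desc1, desc2):
    # Cosine similarity; D2-Net descriptors are L2-normalized.
    sim = desc1 @ desc2.T
    nn12 = np.argmax(sim, axis=1)  # best match in image 2 for each feature of image 1
    nn21 = np.argmax(sim, axis=0)  # best match in image 1 for each feature of image 2
    ids1 = np.arange(desc1.shape[0])
    mutual = ids1 == nn21[nn12]    # keep only pairs that agree in both directions
    return np.stack([ids1[mutual], nn12[mutual]], axis=1)

# Feature files written by extract_features.py ('keypoints', 'scores', 'descriptors').
feat_rgb = np.load('rgb.jpg.d2-net')
feat_ir = np.load('infrared.jpg.d2-net')
matches = mutual_nn_matches(feat_rgb['descriptors'], feat_ir['descriptors'])
print(f'{len(matches)} mutual nearest-neighbor matches')
```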

zxp771 commented 4 years ago

Hi @mihaidusmanu Sorry to disturb you again. I'm wondering whether the depth maps (binary files) generated by COLMAP can be used for training your D2-Net? I read the introduction to the MegaDepth v1 dataset, which says: "the 'depths' folders include hdf5 files, each of which corresponds to depth maps, saved in 32-bit floating-point format, estimated from MVS; all the depths were cleaned based on the methods we described in the paper, so that they have significantly fewer outliers than the original raw depth maps from COLMAP." I'm confused: were all the depths in the MegaDepth v1 dataset generated by MegaDepth's own pipeline, or by COLMAP (I know they modified COLMAP in some places)? Is it just a difference in data format (binary vs. hdf5 files), or is the data itself different? Should the depth information used for training be hdf5 files?

mihaidusmanu commented 4 years ago

Hello. Yes, any depth maps that are consistent (i.e., in the same units) for a given 3D model should work for training. To train D2-Net with the provided code, you need to be able to warp points from one training image to another image observing the same structure, i.e., you need images + depth maps (not necessarily very dense) + intrinsics + poses. However, as I said above, the current implementation only supports the MegaDepth dataset for training, so you will need to implement the dataset class yourself (or set up your dataset in the same way as MegaDepth and modify the pre-processing scripts accordingly).
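To make the warping requirement concrete, here is a rough per-pixel sketch of the geometry (the training code does this densely and also checks the warped depth against image 2's depth map to reject occluded points; function and variable names here are illustrative):

```python
import numpy as np

def warp_pixel(u, v, depth, K1, pose1, K2, pose2):
    # Warp pixel (u, v) with known depth from image 1 into image 2.
    # K1, K2: 3x3 intrinsics; pose1, pose2: 4x4 world-to-camera matrices.
    # Back-project to a 3D point in camera 1 coordinates.
    xyz_cam1 = depth * (np.linalg.inv(K1) @ np.array([u, v, 1.0]))
    # Camera 1 -> world -> camera 2.
    xyz_world = np.linalg.inv(pose1) @ np.append(xyz_cam1, 1.0)
    xyz_cam2 = (pose2 @ xyz_world)[:3]
    # Project into image 2; also return the depth in camera 2 for occlusion checks.
    uvw = K2 @ xyz_cam2
    return uvw[:2] / uvw[2], xyz_cam2[2]
```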

The MegaDepth depth maps were initially generated using COLMAP and then post-processed using the protocol described in their paper. While COLMAP depth maps will also work, the depth maps released with MegaDepth are cleaner (less noise and, if I recall correctly, no depth on moving objects).
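If you start from raw COLMAP depth maps, one option is to convert the .bin files into MegaDepth-style hdf5 files so that a MegaDepth-like loader can read them. A rough sketch, assuming the read_array helper from COLMAP's scripts/python/read_write_dense.py and illustrative file names:

```python
import h5py
import numpy as np
# read_array comes from COLMAP's scripts/python/read_write_dense.py.
from read_write_dense import read_array

# Geometric depth map produced by COLMAP MVS; invalid pixels are <= 0.
depth = read_array('image.jpg.geometric.bin')
depth[depth <= 0] = 0

with h5py.File('image.h5', 'w') as f:
    f.create_dataset('depth', data=depth.astype(np.float32))
```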

ericzzj1989 commented 2 years ago

@zxp771 Hello, have you addressed the issue above about using COLMAP-generated depth data for training?