dansim-umich / ORB_SLAM2_Bilateral_LoFTR

ROB 530 Final Project

How to train the vocabulary with LoFTR features #2

Open Adamqu199541 opened 1 year ago

Adamqu199541 commented 1 year ago

Thanks for providing such wonderful work. We are curious about how the bag-of-words training was conducted. Could you provide the training details?

Aaatresh commented 1 year ago

Bag-of-words (BoW) training consists of extracting feature vectors and storing them in a BoW model, which maintains direct and inverse indices between images and their feature vectors. This information is stored in a database as a hierarchical tree using the DBoW2 framework. For any image feature, and following the approach taken in this project, training amounts to extracting the features and storing them in a database through DBoW2.
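
For illustration, the direct and inverse indices live in DBoW2's database object. Here is a minimal sketch using DBoW2's stock ORB types (not our exact code; the vocabulary file name is a placeholder):

```cpp
#include <vector>
#include <opencv2/core/core.hpp>
#include "DBoW2/DBoW2.h"  // OrbVocabulary, OrbDatabase, QueryResults

int main()
{
  // Load a previously trained vocabulary (the hierarchical tree).
  OrbVocabulary voc("voc.yml.gz");

  // use_di = true keeps a direct index (image -> features grouped by tree
  // node); the inverse index (word -> images) is always maintained.
  OrbDatabase db(voc, true /*use_di*/, 4 /*di_levels*/);

  std::vector<std::vector<cv::Mat>> images;  // per-image ORB descriptors
  // ... fill `images` with one descriptor set per training image ...

  for(const auto &descs : images)
    db.add(descs);  // updates both direct and inverse indices

  DBoW2::QueryResults results;
  if(!images.empty())
    db.query(images[0], results, 5);  // 5 most similar database entries
  return 0;
}
```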

In our work, we extract LoFTR feature vectors, i.e. left and right feature descriptors when working with stereo images. Each feature vector (left and right) is a concatenation of the coarse and fine descriptors and has a size of 384 x 1. The left LoFTR feature vector is stored in the database; the right one is discarded during training. In DBoW2, the branching factor was set to 10 and the number of depth levels to 6.
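
A rough sketch of that training step follows. It assumes a custom `FLoFTR` functor class patterned after DBoW2's `FSurf64` (not part of stock DBoW2) with `TDescriptor = std::vector<float>` and descriptor length 384, and a hypothetical `extractLeftLoftrDescriptors` helper:

```cpp
#include <string>
#include <vector>
#include "DBoW2/TemplatedVocabulary.h"
#include "FLoFTR.h"  // hypothetical functor class modeled on DBoW2's FSurf64

// Vocabulary over 384-dim float descriptors (FLoFTR::TDescriptor would be
// std::vector<float> with descriptor length 384).
typedef DBoW2::TemplatedVocabulary<FLoFTR::TDescriptor, FLoFTR>
    LoftrVocabulary;

// Placeholder: left-image LoFTR descriptors (coarse + fine concatenated,
// 384 floats each) for one image; the right-image vectors are discarded.
std::vector<FLoFTR::TDescriptor>
extractLeftLoftrDescriptors(const std::string &imagePath)
{
  return {};
}

int main()
{
  std::vector<std::string> imagePaths = { /* training images */ };

  // One entry per image; each entry holds that image's descriptors.
  std::vector<std::vector<FLoFTR::TDescriptor>> features;
  for(const std::string &p : imagePaths)
    features.push_back(extractLeftLoftrDescriptors(p));

  // Branching factor k = 10 and depth L = 6, as used in this project.
  LoftrVocabulary voc(10, 6, DBoW2::TF_IDF, DBoW2::L1_NORM);
  voc.create(features);
  voc.save("loftr_voc.yml.gz");
  return 0;
}
```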

Adamqu199541 commented 1 year ago

Thank you for your reply. In my experiments, I found that DBoW2 only supports binary feature descriptors, while deep features such as SuperPoint and LoFTR use float descriptors. Is there any difference between your vocabulary training process and the standard training process? I also tried using DBoW3, which supports float feature descriptors, for vocabulary training. My process is almost consistent with what you described: use the deep local feature detector to extract float descriptors, then run the standard DBoW3 training code to train the vocabulary. However, I found that the trained vocabulary does not work in ORB-SLAM2. Are there any tricks in your vocabulary training? Could you please provide your vocabulary training code in your repository?
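
For reference, my DBoW3 training process looks roughly like this (a sketch; `extractLoftrDescriptors` is a placeholder stub for the actual LoFTR extraction):

```cpp
#include <string>
#include <vector>
#include <opencv2/core/core.hpp>
#include "DBoW3/DBoW3.h"

// Placeholder for the actual LoFTR extraction: one CV_32F cv::Mat per
// image, with one 384-dim descriptor per row.
cv::Mat extractLoftrDescriptors(const std::string &imagePath)
{
  return cv::Mat(0, 384, CV_32F);
}

int main()
{
  std::vector<std::string> imagePaths = { /* training images */ };

  std::vector<cv::Mat> features;
  for(const std::string &p : imagePaths)
    features.push_back(extractLoftrDescriptors(p));

  // DBoW3 clusters float descriptors directly; no binary conversion needed.
  DBoW3::Vocabulary voc(10 /*k*/, 6 /*L*/, DBoW3::TF_IDF, DBoW3::L1_NORM);
  voc.create(features);
  voc.save("loftr_voc.dbow3");
  return 0;
}
```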

Aaatresh commented 1 year ago

Our integration of LoFTR into the ORB-SLAM2 framework did not perform well when using the DBoW2 framework. The mismatch in feature type, i.e. float versus binary descriptors, could be the root cause of this problem, but more experimentation and examination are needed to be sure. Replacing DBoW2 with DBoW3 would be a good starting point.

There is no difference between the training performed in our framework and the standard training process.

What do you mean by "However, I find that the trained vocabulary does not work in ORB-SLAM2"? Is there difficulty building the code, or is ORB-SLAM2 with this integration not performing well?

Here is a link to our vocabulary training code. Apologies for not including it in this repository. Depending on the feature vector size, please modify FORB.cpp by changing the variable `L` (see the note below).
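
For reference, the relevant definition in stock DBoW2 looks like this; the 384 value shown in the comment is an assumption about the adjustment for our descriptors, so please check it against the linked code:

```cpp
// DBoW2/src/FORB.cpp -- descriptor length, fixed at compile time.
const int FORB::L = 32;   // stock value: 32 bytes for 256-bit ORB

// For 384-dimensional LoFTR descriptors, change it to match the feature
// vector size, e.g.:
// const int FORB::L = 384;
```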