YungeCui / BoW3D

[RA-L] BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM.

Clarification on "Words" and Word Comparisons #11

Open DanMcGann opened 7 months ago

DanMcGann commented 7 months ago

Hello! Firstly, I am very interested in your code and corresponding paper. After reading both, however, I have a few questions I am hoping to get help with.

Q1: What is a "word"?

In the paper and the code, a LinK3D descriptor is an $N\times180$ matrix, where each row holds the relative distances between a feature point and its nearest other feature point within each sector, with the sectors oriented around the point's globally closest neighbor.

As far as I can tell, each row of this descriptor (a $1\times180$ vector) is used as a "word" in the proposed BoW system.

Is this understanding correct?

Q2: How are words compared?

If the above description of a word is correct, then a word is a 180-dimensional vector of distances, which the code stores as 32-bit floats. How are words compared to identify whether multiple scans contain the same words?

The code appears to simply compare words with direct equality; however, this could not possibly give a good result. Due to noise in real-world data and the effects of clustering in the LinK3D generation, I would never expect corresponding features to be exactly equal at 32-bit precision. Therefore, I must be missing some rounding or bucketing in the code or paper, and would greatly appreciate clarification.

YungeCui commented 6 months ago

Thank you for your interest in my work. Q1. A word in BoW3D consists of two parts. The first is the value of a dimension in the LinK3D descriptor (i.e., the relative distance). The second is the ID of that dimension in the LinK3D descriptor (a value between 0 and 180).
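To make the two-part word concrete, here is a minimal Python sketch (the repository itself is C++). The names `Word` and `words_from_row` are hypothetical and not taken from the BoW3D code, and skipping zero-valued dimensions is an assumption based on the thread's mention of "non-zero" dimensions.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """Hypothetical container for one BoW3D-style word."""
    value: float  # relative distance stored in one descriptor dimension
    dim_id: int   # index of that dimension within the descriptor row


def words_from_row(row: List[float]) -> List[Word]:
    """Turn one descriptor row into (value, dimension ID) words.

    Assumption: a dimension holding 0.0 is empty and yields no word.
    """
    return [Word(v, i) for i, v in enumerate(row) if v != 0.0]


# A 180-dimensional row with two populated dimensions (made-up values).
row = [0.0] * 180
row[10] = 1.4567
row[42] = 3.2
words = words_from_row(row)
print(words)
```

Under this reading, a single descriptor row contributes up to one word per populated dimension rather than acting as a single 180-dimensional word.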

Q2. If multiple places contain the same word, the score of each of these places is increased by 1. If the score of a place is higher than th_f, the place is used for further matching verification.
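The voting scheme described in this answer can be sketched in Python as an inverted index from words to places. This is only an illustration under stated assumptions: `build_index` and `candidate_places` are hypothetical names, words are modelled as hashable `(value, dim_id)` tuples, and the example data and threshold are made up.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set


def build_index(places: Dict[int, Iterable[Hashable]]) -> Dict[Hashable, Set[int]]:
    """Map each word to the set of place IDs whose scans contain it."""
    index: Dict[Hashable, Set[int]] = defaultdict(set)
    for place_id, place_words in places.items():
        for w in place_words:
            index[w].add(place_id)
    return index


def candidate_places(query_words: Iterable[Hashable],
                     index: Dict[Hashable, Set[int]],
                     th_f: int) -> List[int]:
    """Add 1 to a place's score for every query word it contains,
    then keep the places whose score is higher than th_f."""
    scores: Dict[int, int] = defaultdict(int)
    for w in query_words:
        for place_id in index.get(w, ()):
            scores[place_id] += 1
    return sorted(p for p, s in scores.items() if s > th_f)


# Made-up database of two places and one query scan.
places = {
    0: {(1.5, 10), (3.2, 42), (7.1, 90)},
    1: {(1.5, 10), (9.9, 3)},
}
index = build_index(places)
query = [(1.5, 10), (3.2, 42), (7.1, 90)]
candidates = candidate_places(query, index, th_f=2)
print(candidates)
```

Here place 0 shares three words with the query and place 1 shares only one, so with `th_f=2` only place 0 would go on to further matching verification.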

DanMcGann commented 6 months ago

Hello! Thank you for your quick response!

Given your answer, a "word" may be something like [1.4567, 10], indicating that in a LiDAR frame there is some feature that has its nearest neighbor in the 10th rotational bucket at a distance of 1.4567 m.

I am still confused about how words are compared, because I would expect a matching scan to contain a word such as [1.4572, 10] that corresponds to the one above. However, because of noise in the LiDAR scan and the clustering process, their distances differ, and simply evaluating direct equality would not give good results.

To put it more simply: how would the words [1.4567, 10] and [1.4572, 10] be compared for equality?

YungeCui commented 6 months ago

Each non-zero dimension in LinK3D is approximated to one decimal place.
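As a rough illustration of why this resolves the question, the Python sketch below quantizes the distance to one decimal place before comparing. `quantize_word` is a hypothetical helper, and storing the result as an integer number of tenths (to avoid comparing floats) is a choice made for this sketch, not something confirmed by the BoW3D code.

```python
from typing import Tuple


def quantize_word(value: float, dim_id: int) -> Tuple[int, int]:
    """Round the distance to one decimal place, stored as integer tenths.

    Assumption: integer tenths are used here only so that equality is an
    exact integer comparison; the original code may store a rounded float.
    """
    return (int(round(value * 10)), dim_id)


a = quantize_word(1.4567, 10)
b = quantize_word(1.4572, 10)
print(a, b, a == b)

# Values on opposite sides of a rounding boundary still end up different.
c = quantize_word(1.44, 10)
d = quantize_word(1.46, 10)
print(c, d, c == d)
```

Both example words from the thread land in the same 0.1 m bucket and compare equal. Two nearly identical distances that straddle a rounding boundary would not, which is presumably tolerated because a place only needs more than th_f matching words, not all of them.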

DanMcGann commented 6 months ago

Perfect! This is exactly the clarification I needed! Thanks for the help!