raulmur / ORB_SLAM2

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

simple feature classifier #838

Open KarimHabbab92 opened 4 years ago

KarimHabbab92 commented 4 years ago

Hello all, I am trying to create a simple feature classifier by adding counters for the map points visited by the SearchByProjection functions and then storing them with the map through the Boost.Serialization library. My questions are: 1- Is that an effective way to count? I mean, counting by the memory address of the visited map point, or will that memory address be different every time we run the map? If so, what is the suggested way to store such a rank for every map point? 2- Should I serialize the object that contains those counters separately from the map object?

Thank you so much in advance

AlejandroSilvestri commented 4 years ago

@KarimHabbab92

Memory addresses will be different every time you run the map.

MapPoint has a unique identifier, mnId. Use it instead of the memory address.

And it has Tracking counters too. How many times the mappoint was found in a frame, and how many times it should have been visible, no matter if it was actually matched on image.

These are protected properties, so you will have to work out how to access them.
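A minimal sketch of the idea above: key the counters on the mappoint's id rather than on its address. `FakeMapPoint` and `VisitCounters` are hypothetical names made up for illustration and are not part of ORB-SLAM2. The sketch also assumes that mnId is saved and restored together with the map, which depends on how the map is serialized.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for ORB_SLAM2::MapPoint; only the id matters here.
struct FakeMapPoint {
    unsigned long mnId;
};

// Hypothetical tally keyed by mnId, so it stays meaningful after a reload.
struct VisitCounters {
    std::map<unsigned long, unsigned int> visits;  // mnId -> times visited

    void RecordVisit(const FakeMapPoint* pMP) {
        assert(pMP != nullptr);
        ++visits[pMP->mnId];  // key on the id, never on the pointer value
    }

    unsigned int VisitsOf(unsigned long id) const {
        auto it = visits.find(id);
        return it == visits.end() ? 0u : it->second;
    }
};
```

Two different objects carrying the same id update the same counter, which is the property a pointer-keyed table loses between runs.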

KarimHabbab92 commented 4 years ago

@AlejandroSilvestri Thank you so much, and sorry for the late response. So I could rank the map points and update this rank with the number of times those map points are caught in the scene every time I drive through the same area again, and finally save this rank with the map for another drive in the area. The thing I am still not sure about is where to use this rank in the algorithm to distinguish the most stable map points (those with the highest rank) for relocalization and better resistance to changing environmental conditions. Any suggestion about applying functions other than counting the stable map points would be much appreciated. Thank you so much

AlejandroSilvestri commented 4 years ago

@KarimHabbab92

The usual uses of these two properties are:

  1. mnFound, how many times the mappoint was seen in frames. The more times the better.
  2. Ratio mnFound/mnVisible, percentage of times it was seen over every time it should have been seen. This ratio talks about the "visibility" of the point. Mappoints that can be detected from a wide angle have a high ratio. Mappoints viewable only from a certain perspective have a low ratio.

So, the first is the main criterion, the second is a complementary criterion.
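As a rough illustration of the two criteria, the sketch below sorts points by found count first and uses the found/visible ratio only to break ties. `PointStats` and `RankPoints` are hypothetical names; in a real build the two numbers would have to be read from the MapPoint counters described above.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical snapshot of one mappoint's tracking counters.
struct PointStats {
    unsigned long id;  // mnId of the mappoint
    int found;         // times the point was matched in a frame
    int visible;       // times the point should have been visible

    double FoundRatio() const {
        return visible > 0 ? static_cast<double>(found) / visible : 0.0;
    }
};

// Main criterion: higher found count. Complementary criterion: higher ratio.
inline std::vector<PointStats> RankPoints(std::vector<PointStats> pts) {
    std::sort(pts.begin(), pts.end(),
              [](const PointStats& a, const PointStats& b) {
                  if (a.found != b.found) return a.found > b.found;
                  return a.FoundRatio() > b.FoundRatio();
              });
    return pts;
}
```

How to weight the two criteria is a design choice; a strict tie-break is only one simple option.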

abhy1012 commented 4 years ago

Hi, Where can I modify the code to rank and log the map points?

AlejandroSilvestri commented 4 years ago

@abhy1012

Two points in code to log:

LocalMapping::CreateNewMapPoints() is almost the only place where mappoints are created. The very first ones are created during initialization, elsewhere in the code.

MapPoint::SetBadFlag() is the place where mappoints are marked for deletion.
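One possible shape for such a log, as a sketch only: a small recorder with one method to call from each of the two places. `MapPointEventLog`, its methods and the line format are all hypothetical; nothing like this exists in ORB-SLAM2, and the calls would have to be added by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical event recorder; each entry is one text line.
struct MapPointEventLog {
    std::vector<std::string> lines;

    // Intended to be called where a mappoint is created.
    void Created(unsigned long pointId, unsigned long keyFrameId) {
        lines.push_back("created mp=" + std::to_string(pointId) +
                        " kf=" + std::to_string(keyFrameId));
    }

    // Intended to be called where a mappoint is marked for deletion.
    void MarkedBad(unsigned long pointId, int found, int visible) {
        lines.push_back("bad mp=" + std::to_string(pointId) +
                        " found=" + std::to_string(found) +
                        " visible=" + std::to_string(visible));
    }
};
```

Logging the counters at deletion time keeps the final rank of points that never make it into the saved map.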

abhy1012 commented 4 years ago

@AlejandroSilvestri Thanks for the reply. Are these map points filtered in the Frame or Tracking functions? If so, where and how?

I am trying to log the features with their id once they are decoded from the vocabulary. What is the best possible way to approach this?

AlejandroSilvestri commented 4 years ago

@abhy1012

Well, mappoints and features are quite different. Mappoints are 3D entities, they are objects and have a unique mnId property.

Features are 2D entities on Frames and KeyFrames. They aren't objects; their data is scattered across multiple arrays, like descriptors, BoW features, and so on. And they don't have an Id.

Multiple features (from different KeyFrames) are bound to a single mappoint. Most features, though, aren't bound to any mappoint.

The vocabulary is applied (BoW is computed) on features when a new KeyFrame is created in LocalMapping::ProcessNewKeyFrame(). So, features in KeyFrames have their respective computed BoW. Features in Frame don't, with exceptions (they are only computed for relocalization when the system is lost).

abhy1012 commented 4 years ago

@AlejandroSilvestri So if I want to log features for each keyframe and rank them based on # of occurrences, what is the best way to do it?

Since features don't have an id, how do I identify them from the vocabulary? Are they decoded from the vocabulary somewhere in the source code?

AlejandroSilvestri commented 4 years ago

@abhy1012

Please explain to me what "feature" means to you. There is more than one interpretation, and we are at risk of misunderstanding each other. I'll be happy to help you.

abhy1012 commented 4 years ago

@AlejandroSilvestri

By feature I mean the pixel information that is stored in the vocabulary. I want to log the pixel information that orbslam detects for localization.

abhy1012 commented 4 years ago

I want to compare features for two different SLAM runs on the same dataset. Since the # of features is high and I cannot compare all of them I want to rank those features based on # of time they occur during the run. So I want to know how to get those feature data from vocabulary.

AlejandroSilvestri commented 4 years ago

@abhy1012

Ok, let's sync jargon. Components of a feature:

  1. keypoint, with x,y coords on image, pyramid level, orientation
  2. ORB descriptor
  3. BoW word

Each feature is detected only once. Features belong to the image.

The identity of the feature is twofold:

  1. Id of KeyFrame
  2. index to feature's arrays

The only vocabulary is the BoW vocabulary. So when you mention "vocabulary" I think you are interested in BoW words. There is no pixel information stored in the vocabulary. The vocabulary is a classification tree to label orb descriptors. This label is called a "word". It's an int.

On each frame keypoints and ORB descriptors are computed. You can get these at the end of Frame constructor.

BoW words computation is delayed, they are computed when a new KeyFrame is created.

Features are matched with mappoints. Each feature matches with only one mappoint, or none at all. Each mappoint matches many features, but no more than one per keyframe.

You can rank a mappoint based on the number of features, or on the times it was observed.
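To make the identity and the first ranking option concrete, here is a sketch that names a feature by the pair (KeyFrame id, array index) and counts how many features are bound to each mappoint. `FeatureId`, `FeatureToPoint` and `CountFeaturesPerPoint` are hypothetical names; ORB-SLAM2 does not keep such a table, so it would have to be filled from the KeyFrames.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical feature handle: (KeyFrame id, index into that KeyFrame's arrays).
using FeatureId = std::pair<unsigned long, std::size_t>;

// Hypothetical table: feature -> mnId of its matched mappoint.
// Features without a mappoint are simply absent.
using FeatureToPoint = std::map<FeatureId, unsigned long>;

// Count how many features are bound to each mappoint.
inline std::map<unsigned long, std::size_t>
CountFeaturesPerPoint(const FeatureToPoint& matches) {
    std::map<unsigned long, std::size_t> counts;
    for (const auto& match : matches) {
        ++counts[match.second];
    }
    return counts;
}
```

Because a feature maps to at most one mappoint, each table entry contributes to exactly one count.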

abhy1012 commented 4 years ago

@AlejandroSilvestri Thanks for the explanation. This was very helpful. Also, does the mappoint match features across different keyframes? What do you mean by "Each mappoint matches many features, but no more than one per keyframe"?

abhy1012 commented 4 years ago

@AlejandroSilvestri

Can you also explain the structure of the BOW?

I know that a bag of words is represented as an ordered vector or a dictionary with the counts of words. If this is the case, then I can just add up all of the vectors for all keyframes in the dataset and then sort the result. I am confused as to how they are structured. If you could explain, that would be a great help.

AlejandroSilvestri commented 4 years ago

@abhy1012

One mappoint can match one feature in a keyframe, but no more than one in that keyframe. The mappoint can't be seen twice in the same image.
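The "no more than one per keyframe" rule can be pictured as a map from KeyFrame id to feature index, where a second entry for the same KeyFrame is refused. This is a sketch with hypothetical names (`Observations`, `Add`), not the actual MapPoint code.

```cpp
#include <cstddef>
#include <map>

// Hypothetical observation table for a single mappoint.
struct Observations {
    std::map<unsigned long, std::size_t> byKeyFrame;  // KeyFrame id -> feature index

    // Returns false (and changes nothing) if this KeyFrame already observes the point.
    bool Add(unsigned long keyFrameId, std::size_t featureIndex) {
        return byKeyFrame.emplace(keyFrameId, featureIndex).second;
    }
};
```

The size of the table is then the number of keyframes that observe the point, which is one of the ranking options mentioned earlier in the thread.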

KeyFrame::mFeatVec is a std::map<NodeId, std::vector<unsigned int>>

Each NodeId is associated with a single "word" in the vocabulary. mFeatVec is a map from NodeId to a vector of indices to mvKeys, the KeyFrame's KeyPoints array. All the indices in the same vector represent features associated to the same word.
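Regarding the idea of adding up the vectors of all keyframes and sorting: with a structure like mFeatVec, that amounts to summing the length of each node's index vector across keyframes. A sketch, assuming NodeId is an unsigned int and using local stand-in types rather than the DBoW2 headers; `RankNodes` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Local stand-ins for the types described above (assumption, not the DBoW2 headers).
using NodeId = unsigned int;
using FeatureVector = std::map<NodeId, std::vector<unsigned int>>;
using NodeCount = std::pair<NodeId, std::size_t>;

// Sum, over all keyframes, how many features fall under each node,
// then sort the nodes from most to least frequent.
inline std::vector<NodeCount> RankNodes(const std::vector<FeatureVector>& keyframes) {
    std::map<NodeId, std::size_t> totals;
    for (const auto& featVec : keyframes) {
        for (const auto& entry : featVec) {
            totals[entry.first] += entry.second.size();
        }
    }
    std::vector<NodeCount> ranked(totals.begin(), totals.end());
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const NodeCount& a, const NodeCount& b) {
                         return a.second > b.second;
                     });
    return ranked;
}
```

This ranks nodes of the vocabulary, not individual features: many different features in different keyframes fall under the same node.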

abhy1012 commented 4 years ago

@AlejandroSilvestri

Thanks for the reply. If I understand correctly, the numbers in the vocabulary represent features associated with the same word. Does each array correspond to one word in the vocabulary? How do I interpret those numbers in the vocabulary?

AlejandroSilvestri commented 4 years ago

@abhy1012

for (const auto& entry : mFeatVec) {
    const auto nodeId = entry.first;     // nodeId is a leaf, one to one associated with a word
    const auto& indices = entry.second;  // indices of the features under that node
    for (unsigned int index : indices) {
        // features
        const cv::KeyPoint& keypoint = mvKeys[index];
        cv::Mat descriptor = mDescriptors.row(index);
        MapPoint* mappointPointer = mvpMapPoints[index];  // NULL if there is no mappoint for this feature
        // every vector of size KeyFrame::N belongs to features and can be accessed with index
    }
}

I hope this helps.

abhy1012 commented 4 years ago

Thanks @AlejandroSilvestri.

The bow for each frame or keyframe is matched with the bow in the orbvoc.txt right? If so where can I log the words from my dataset?

Which file has the code you mentioned?

AlejandroSilvestri commented 4 years ago

@abhy1012

I don't remember the exact structure or the functions in DBoW2; you may want to take a look at them.

The vocabulary is a static tree, with a word in each leaf. You already have the NodeId corresponding to a leaf, so there is a way to get to that node and get its associated word.

The code I posted is a made up example showing how to get and use nodeId and indices in KeyFrame. It's not part of ORB-SLAM2.