introlab / rtabmap

RTAB-Map library and standalone application
https://introlab.github.io/rtabmap

How can I train an incremental BoW vocabulary offline? #1289

Open gvasserm opened 1 month ago

gvasserm commented 1 month ago

Hi, I'm trying to replace the incremental BoW vocabulary with a fixed vocabulary trained offline on a subset of the mapping data. Is it possible to train it correctly offline? I'm currently using the following code for training; however, the similarity scores are not quite the same as the scores I get from online training during mapping:

#include <rtabmap/core/Rtabmap.h>
#include <rtabmap/core/Memory.h>
#include <rtabmap/core/VWDictionary.h>
#include <rtabmap/core/Parameters.h>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <iostream>
#include <list>
#include <map>
#include <string>
#include <vector>

using namespace rtabmap;

void train_incremental(
    const std::vector<std::string> &dataset_files,
    const std::string &fileNameReferences,
    const std::string &fileNameDescriptors,
    bool detect_describe = true)
{
    ParametersMap params;
    Rtabmap *rtabmap = new Rtabmap();
    rtabmap->init(params);

    // Non-const accessors to the internal memory and dictionary
    // (added locally; the stock API only exposes const getters)
    Memory *memory = rtabmap->getMemoryC();
    VWDictionary *vwd = memory->getVWDictionaryC();

    std::map<int, std::list<int> > wordIds;   // word ids added per image
    std::map<int, float> wordC;               // word count per image
    std::vector<cv::Mat> features_all(dataset_files.size());

    vwd->setIncrementalDictionary();

    int id = 0;
    size_t N = dataset_files.size();
    for (size_t i = 0; i < N; ++i)
    {
        cv::Mat features;
        const std::string &f = dataset_files[i];
        if (detect_describe)
        {
            // Extract ORB descriptors directly from the image
            cv::Ptr<cv::ORB> orb = cv::ORB::create(2000);
            cv::Mat im = cv::imread(f);
            std::vector<cv::KeyPoint> keypoints;
            orb->detect(im, keypoints);
            orb->compute(im, keypoints, features);
        }
        else
        {
            // Reuse descriptors saved from an online mapping run
            features = load_descriptors(f);
        }
        wordIds[id] = vwd->addNewWords(features, id);
        vwd->update(); // rebuild the NN index after each image, as done online
        wordC[id] = wordIds[id].size();
        features_all[id] = features;
        id++;
    }

    std::cout << "Number of words: " << vwd->getIndexedWordsCount() << std::endl;
    vwd->exportDictionary(fileNameReferences.c_str(), fileNameDescriptors.c_str());
    rtabmap->close(false);
    delete rtabmap;
}
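
The plan is then to load the exported files back as a fixed (non-incremental) vocabulary when mapping, roughly like this (a sketch only; Kp/IncrementalDictionary and Kp/DictionaryPath are existing parameters, but whether the file written by exportDictionary() is accepted as-is by Kp/DictionaryPath is an assumption on my part):

// Sketch: use the offline-trained vocabulary as a fixed dictionary at mapping time.
// Whether the exportDictionary() descriptor file is directly compatible with
// Kp/DictionaryPath is an assumption here, not something I have verified.
ParametersMap params;
params.insert(ParametersPair(Parameters::kKpIncrementalDictionary(), "false"));
params.insert(ParametersPair(Parameters::kKpDictionaryPath(), fileNameDescriptors));
Rtabmap rtabmap;
rtabmap.init(params, "rtabmap.db");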

Thanks in advance for any help)

matlabbe commented 3 weeks ago

Make sure to launch rtabmap with Kp/DetectorStrategy=2 so that ORB is also used when you use that dictionary afterwards.

Can you compare with a dictionary created with:

rtabmap-console --Kp/DetectorStrategy 2 --Kp/MaxFeatures 2000 my_dataset_folder

Maybe the high number (2000) of features per image affects the quantization.
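
In your offline training code, the equivalent would be to pass those parameters to Rtabmap::init(), for example (a sketch, assuming the usual ParametersMap / Parameters accessors):

// Set the same detector parameters used online before initializing rtabmap
ParametersMap params;
params.insert(ParametersPair(Parameters::kKpDetectorStrategy(), "2"));  // 2 = ORB
params.insert(ParametersPair(Parameters::kKpMaxFeatures(), "2000"));
rtabmap->init(params);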

gvasserm commented 3 weeks ago

Hi, thank you for the reply. I'm currently using GFTT (--Kp/DetectorStrategy 8) and training the dictionary with descriptors saved from an online run. When I compared the dictionaries built online (during mapping) and offline (from the saved descriptors), it turned out that, to get a number of loop closures comparable to the online-trained dictionary, you need to remove duplicate frames from the dataset. This is actually a bit tricky, since you need to define what counts as a duplicate; currently I'm using CosPlace (a global descriptor) to compute similarity between frames and then clustering the frames by a similarity threshold.
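
Roughly what the de-duplication looks like (a sketch; the CosPlace embeddings are computed separately and stored as L2-normalized rows of a cv::Mat, so cosine similarity reduces to a dot product, and the threshold is something I tune by hand):

#include <opencv2/core.hpp>
#include <vector>

// Greedy de-duplication: a frame is kept only if it is not too similar to any
// frame already kept. 'desc' holds one L2-normalized global descriptor per row.
std::vector<int> keepFrames(const cv::Mat &desc, float simThreshold = 0.8f)
{
    std::vector<int> kept;
    for (int i = 0; i < desc.rows; ++i)
    {
        bool duplicate = false;
        for (int j : kept)
        {
            // cosine similarity of normalized vectors = dot product
            float sim = (float)desc.row(i).dot(desc.row(j));
            if (sim > simThreshold) { duplicate = true; break; }
        }
        if (!duplicate) kept.push_back(i);
    }
    return kept; // indices of the frames used for vocabulary training
}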

matlabbe commented 2 weeks ago

During online mapping, the similarity is estimated using this function: https://github.com/introlab/rtabmap/blob/0f03db9d7f720dd6b7d0018f4c3a8719ae4f3afa/corelib/src/Signature.cpp#L250-L287

Then, if the similarity between consecutive images is over Mem/RehearsalSimilarity, the image is discarded. The same happens if the robot is not moving: images are discarded according to the RGBD/LinearUpdate and RGBD/AngularUpdate parameters.
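
For reference, what that function computes boils down to something like this (a rough paraphrase, not the actual implementation; see the linked Signature.cpp for the authoritative code):

#include <algorithm>
#include <map>
#include <set>

// Rough paraphrase of Signature::compareTo(): similarity is the number of
// visual word ids shared by the two signatures, divided by the word count of
// the signature that has the most words. Details (invalid words, duplicate
// word ids) are handled in the real code and ignored here.
float similarity(const std::multimap<int, int> &wordsA,
                 const std::multimap<int, int> &wordsB)
{
    std::set<int> idsA, idsB;
    for (const auto &w : wordsA) idsA.insert(w.first);
    for (const auto &w : wordsB) idsB.insert(w.first);

    int pairs = 0;
    for (int id : idsA)
        if (idsB.count(id)) ++pairs;

    int totalWords = (int)std::max(idsA.size(), idsB.size());
    return totalWords > 0 ? (float)pairs / (float)totalWords : 0.0f;
}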