LiuLimingCode / HFNet_SLAM

HFNet-SLAM: An accurate and real-time monocular SLAM system with deep features

float 16 model? #3

Closed: antithing closed this issue 1 year ago

antithing commented 1 year ago

Hi, and thank you for making this code available.

I am running the test extraction example, and I see that:

================= HFNet TensorRT Float 16: kImageToLocalAndGlobal =====================
Evaluate the run time perfomance in dataset: 

is faster than:

================= HFNet TensorRT Float 32: kImageToLocalAndGlobal =====================
Evaluate the run time perfomance in dataset: 

However, only the HF-Net.onnx model is included, and it seems to be float 32. Could you please share the faster float 16 model for TensorRT?

I also see this message when loading:

WARNING: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.

Thank you!

antithing commented 1 year ago

...I also find that the SLAM fails on the EuRoC monocular datasets (TensorRT, MH_01).

I see the following:

Successfully loaded HFNet TensorRT model. Mode: ImageToLocal Shape: [1, 400, 627, 1]
Initialization of Atlas from scratch
Creation of new map with id: 0
Creation of new map with last KF id: 0
There are 1 cameras in the atlas
Camera 0 is pinhole
First KF:0; Map init KF:0
New Map created with 867 points
Init frame id: 0
Starting the Viewer
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!
Fail to track local map!

However, if I use 'Step by Step' and step through the frames, tracking succeeds. What might be happening here? Is there anything I can look at to resolve this?

Thanks!

LiuLimingCode commented 1 year ago

Hello. The HF-Net ONNX model is indeed float 32. However, the model is optimised further when the TensorRT engine is built: as you can see at 'HFNetRTModel.cc:230', the TensorRT build flag decides the precision at which the HF-Net model runs in the engine, and float 16 is the default setting.
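
Because the engine runs in float 16, the float 32 weights stored in the ONNX file are narrowed when the engine is built. The sketch below only illustrates what that narrowing means for very small or very large weights, using the IEEE 754 half-precision limits. It is not TensorRT code and not taken from HFNet_SLAM; `Fp16Range`, `classifyForFp16` and `countInRange` are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// IEEE 754 half-precision (FP16) limits.
const float kFp16MinSubnormal = std::ldexp(1.0f, -24);  // about 5.96e-8
const float kFp16MinNormal    = std::ldexp(1.0f, -14);  // about 6.10e-5
const float kFp16Max          = 65504.0f;               // largest finite FP16 value

// Where a float 32 value lands relative to the FP16 representable range.
enum class Fp16Range { Zero, BelowMinSubnormal, Subnormal, Normal, ExceedsMax };

// Classify a single float 32 weight by magnitude (hypothetical helper).
Fp16Range classifyForFp16(float w) {
    assert(!std::isnan(w));
    const float a = std::fabs(w);
    if (a == 0.0f) return Fp16Range::Zero;
    if (a < kFp16MinSubnormal) return Fp16Range::BelowMinSubnormal;
    if (a < kFp16MinNormal) return Fp16Range::Subnormal;
    if (a > kFp16Max) return Fp16Range::ExceedsMax;
    return Fp16Range::Normal;
}

// Count how many weights fall into a given range.
std::size_t countInRange(const std::vector<float>& weights, Fp16Range range) {
    std::size_t count = 0;
    for (float w : weights) {
        if (classifyForFp16(w) == range) ++count;
    }
    return count;
}
```

For instance, `classifyForFp16(1e-5f)` returns `Fp16Range::Subnormal`, because 1e-5 is smaller than the smallest normal FP16 value (about 6.10e-5) but larger than the smallest subnormal one.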

LiuLimingCode commented 1 year ago

As for your failure on the EuRoC dataset, I am afraid I cannot offer useful suggestions based on the information you gave. I don't understand what you mean by "step by step". Do you mean running the program in a debugger? For comparison, here is the output of the program from a successful run on the EuRoC dataset with the Mono configuration.

llm@llm-ubuntu-20:~/ROS/HFNet_SLAM$ pathDataset='/media/llm/Datasets/EuRoC/' # it is necesary to change it by the dataset path
llm@llm-ubuntu-20:~/ROS/HFNet_SLAM$ pathEvaluation='./evaluation/Euroc/'
llm@llm-ubuntu-20:~/ROS/HFNet_SLAM$ sequenceName='MH01'
llm@llm-ubuntu-20:~/ROS/HFNet_SLAM$ ./Examples/Monocular/mono_euroc ./Examples/Monocular/EuRoC.yaml "$pathEvaluation"/"$sequenceName"_MONO/ "$pathDataset"/"$sequenceName" ./Examples/Monocular/EuRoC_TimeStamps/"$sequenceName".txt
num_seq = 1
settings path: ./Examples/Monocular/EuRoC.yaml
result save path: ./evaluation/Euroc//MH01_MONO/
Loading images for sequence 0...LOADED!

-------

ORB-SLAM3 Copyright (C) 2017-2020 Carlos Campos, Richard Elvira, Juan J. Gómez, José M.M. Montiel and Juan D. Tardós, University of Zaragoza.
ORB-SLAM2 Copyright (C) 2014-2016 Raúl Mur-Artal, José M.M. Montiel and Juan D. Tardós, University of Zaragoza.
This program comes with ABSOLUTELY NO WARRANTY;
This is free software, and you are welcome to redistribute it
under certain conditions. See LICENSE.txt.

Input sensor was set to: Monocular
Loading settings from ./Examples/Monocular/EuRoC.yaml
Camera1.k3 optional parameter does not exist...
    -Loaded camera 1
Camera.newHeight optional parameter does not exist...
Camera.newWidth optional parameter does not exist...
    -Loaded image info
    -Loaded ORB settings
Viewer.imageViewScale optional parameter does not exist...
    -Loaded viewer settings
System.LoadAtlasFromFile optional parameter does not exist...
System.SaveAtlasToFile optional parameter does not exist...
    -Loaded Atlas settings
System.thFarPoints optional parameter does not exist...
    -Loaded misc parameters
----------------------------------
SLAM settings: 
    -Camera 1 parameters (Pinhole): [ 458.65399169921875 457.29598999023438 367.21499633789062 248.375 ]
    -Camera 1 distortion parameters: [  -0.28340810537338257 0.073959067463874817 0.00019359000725671649 1.7618711353861727e-05 ]
    -Original image size: [ 752 , 480 ]
    -Current image size: [ 752 , 480 ]
    -Sequence FPS: 20
    -Scale factor of image pyramid: 1.2000000476837158
    -Levels of image pyramid: 4
    -Features per image: 675
    -Detector threshold: 0.0099999997764825821
    -Load model path: /home/llm/ROS/HFNet_SLAM/model/HFNet-RT/

WARNING: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
WARNING: TensorRT encountered issues when converting weights between types and that could affect accuracy.
WARNING: If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
WARNING: Check verbose logs for the list of affected weights.
WARNING: - 51 weights are affected by this issue: Detected subnormal FP16 values.
WARNING: - 31 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Saved 1970277 bytes of timing cache to /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Successfully loaded HFNet TensorRT model. Mode: ImageToLocalAndGlobal Shape: [1, 480, 752, 1]
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
WARNING: TensorRT encountered issues when converting weights between types and that could affect accuracy.
WARNING: If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
WARNING: Check verbose logs for the list of affected weights.
WARNING: - 13 weights are affected by this issue: Detected subnormal FP16 values.
WARNING: - 3 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Saved 1970277 bytes of timing cache to /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Successfully loaded HFNet TensorRT model. Mode: ImageToLocal Shape: [1, 400, 627, 1]
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
WARNING: TensorRT encountered issues when converting weights between types and that could affect accuracy.
WARNING: If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
WARNING: Check verbose logs for the list of affected weights.
WARNING: - 13 weights are affected by this issue: Detected subnormal FP16 values.
WARNING: - 3 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Saved 1970277 bytes of timing cache to /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Successfully loaded HFNet TensorRT model. Mode: ImageToLocal Shape: [1, 333, 522, 1]
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
WARNING: TensorRT encountered issues when converting weights between types and that could affect accuracy.
WARNING: If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
WARNING: Check verbose logs for the list of affected weights.
WARNING: - 13 weights are affected by this issue: Detected subnormal FP16 values.
WARNING: - 3 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Loaded 1970277 bytes of timing cache from /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Saved 1970277 bytes of timing cache to /home/llm/ROS/HFNet_SLAM/model/HFNet-RT//HF-Net.cache
Successfully loaded HFNet TensorRT model. Mode: ImageToLocal Shape: [1, 278, 435, 1]
Initialization of Atlas from scratch 
Creation of new map with id: 0
Creation of new map with last KF id: 0
There are 1 cameras in the atlas
Camera 0 is pinhole
Starting the Viewer
First KF:0; Map init KF:0
New Map created with 423 points
Init frame id: 0
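
The four "Shape" lines in the log above are consistent with the pyramid settings it prints (scale factor 1.2, 4 levels, 752 x 480 input): each level divides the original size by one more power of the scale factor. A minimal sketch of that arithmetic follows. Rounding to the nearest integer is an assumption that happens to reproduce the logged shapes; it has not been checked against the HFNet_SLAM source, and `pyramidSizes` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Returns (height, width) for every pyramid level, level 0 being the input image.
// Assumption: each level is the original size divided by scaleFactor^level,
// rounded to the nearest integer.
std::vector<std::pair<int, int>> pyramidSizes(int height, int width,
                                              double scaleFactor, int levels) {
    assert(height > 0 && width > 0 && scaleFactor > 1.0 && levels > 0);
    std::vector<std::pair<int, int>> sizes;
    double scale = 1.0;
    for (int level = 0; level < levels; ++level) {
        sizes.emplace_back(static_cast<int>(std::lround(height / scale)),
                           static_cast<int>(std::lround(width / scale)));
        scale *= scaleFactor;  // 1.0, 1.2, 1.44, 1.728, ...
    }
    return sizes;
}
```

Calling `pyramidSizes(480, 752, 1.2, 4)` gives (480, 752), (400, 627), (333, 522) and (278, 435), which matches the shapes [1, 480, 752, 1], [1, 400, 627, 1], [1, 333, 522, 1] and [1, 278, 435, 1] in the log.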

Besides, this project is based on the well-known 'ORB-SLAM3' project and shares much of its framework. If you are unfamiliar with ORB-SLAM3, I suggest studying it as a first step.

antithing commented 1 year ago

Thank you again! I have taken out your HFExtractor class and am testing it with SuperGlue to add robust matching.

However, I only see a few matches, even when matching the exact same features against themselves. Do you know if SuperGlue will work correctly with HFNet?

My code is:

(*mpExtractorLeft)(image, vKeyPoints, localDescriptors, globalDescriptors); //to extract features

 //matching
    std::string config_path = "config.yaml";
    std::string model_dir = "weights";
    Configs configs(config_path, model_dir);
    std::cout << "Building inference engine......" << std::endl;

    Eigen::Matrix<double, 259, Eigen::Dynamic> feature_points0, feature_points1;
    std::vector<cv::DMatch> superglue_matches;

    //fill eigen from desc and keys
    //image 1
    feature_points0.resize(259, vKeyPoints.size());
    for (int i = 0; i < vKeyPoints.size(); i++) {
        feature_points0(0, i) = 1;
    }

    for (int j = 0; j < vKeyPoints.size(); ++j) {
        feature_points0(1, j) = vKeyPoints[j].pt.x;
        feature_points0(2, j) = vKeyPoints[j].pt.y;
    }

    for (int m = 3; m < 259; ++m) {
        for (int n = 0; n < localDescriptors.rows; ++n) //rows is kpnts size, cols is 256
        {
            feature_points0(n, m) = localDescriptors.at<double>(n,m);
        }
    }

    feature_points1 = feature_points0;

    auto superglue = std::make_shared<SuperGlue>(configs.superglue_config);
    if (!superglue->build()) {
        std::cerr << "Error in SuperGlue building engine. Please check your onnx model path." << std::endl;
        return 0;
    }
    auto start = std::chrono::high_resolution_clock::now();
    superglue->matching_points(feature_points0, feature_points1, superglue_matches,true);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

    std::cout << "matching took: " << duration.count() << std::endl;

But from 712 features (matched against the exact same image), I get only 35 matches. Can you see anything I am doing wrong here?

Thank you!

LiuLimingCode commented 1 year ago

As for your example, I am sorry, but I am not familiar with SuperGlue. Since you matched two identical sets of features, I assume something is wrong somewhere. Frankly, I considered introducing SuperGlue to HF-Net in the initial phase, but I would say that SuperGlue is unsuitable for a real-time SLAM system. SuperGlue is designed for accuracy rather than efficiency, while HF-Net makes the opposite trade-off, so using SuperGlue might ruin the real-time performance. For example, in ORB-SLAM3 and HFNet-SLAM, the tracking thread needs to track keypoints between the current frame and a local map composed of dozens of frames (the TrackLocalMap() function in the ORB-SLAM3 source code). This process requires dozens of frame-to-frame matchings and should finish within 10 ms. SuperGlue, I assume, cannot meet this requirement. Anyway, I wish you good luck working with SuperGlue.
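
Going only by the snippet posted above, one candidate cause is the descriptor loop. The matrix is sized 259 x N (one column per keypoint), yet the loop writes `feature_points0(n, m)` with `n` as the keypoint index and `m` as the row index, so the two indices are swapped. It also reads descriptor column `m` rather than `m - 3`. Separately, `at<double>` is only valid if `localDescriptors` is a CV_64F matrix; whether the extractor returns float 32 descriptors is an assumption worth checking with `localDescriptors.type()`. The sketch below shows the intended layout with plain `std::vector` as a library-free stand-in for Eigen and `cv::Mat`; `Keypoint` and `packFeatures` are hypothetical names, and this may not be the only reason for the low match count.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::KeyPoint: position plus a score.
struct Keypoint { float x; float y; float score; };

constexpr std::size_t kDescDim = 256;         // descriptor length per keypoint
constexpr std::size_t kRows = 3 + kDescDim;   // score, x, y, then 256 descriptor values

// descriptors: row-major N x 256 (one row per keypoint, as in the snippet's comment).
// Returns a row-major 259 x N matrix; element (row, col) is at [row * N + col].
std::vector<double> packFeatures(const std::vector<Keypoint>& keypoints,
                                 const std::vector<float>& descriptors) {
    const std::size_t n = keypoints.size();
    assert(descriptors.size() == n * kDescDim);
    std::vector<double> out(kRows * n, 0.0);
    for (std::size_t col = 0; col < n; ++col) {
        out[0 * n + col] = keypoints[col].score;  // row 0 (set to 1 in the snippet)
        out[1 * n + col] = keypoints[col].x;      // row 1
        out[2 * n + col] = keypoints[col].y;      // row 2
        for (std::size_t d = 0; d < kDescDim; ++d) {
            // Row 3 + d of column col holds descriptor element d of keypoint col.
            out[(3 + d) * n + col] = descriptors[col * kDescDim + d];
        }
    }
    return out;
}
```

In Eigen and OpenCV terms, the same mapping would read `feature_points0(3 + d, n) = localDescriptors.at<float>(n, d)`, assuming float 32 descriptors.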