razayunus / ManhattanSLAM

A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

Error when no keypoints or lines are detected / no relocalization after tracking loss #3

Open Wavelet303 opened 2 years ago

Wavelet303 commented 2 years ago

Hi razayunus, first of all, congratulations on your great work. I have successfully migrated the project to Visual Studio 2019. The examples (TUM RGB-D, ICL-NUIM and TAMU RGB-D) work perfectly and in real time (0.05 s mean tracking time) on my computer. My configuration is:

OpenCV 3.4.5
PCL 1.8.1
Eigen3 3.3.9
Pangolin
DBoW2: included in Thirdparty folder
g2o: included in Thirdparty folder

Subsequently, I tried to use the application live in an indoor room with an Intel RealSense D435 as the RGB-D camera. It works correctly except when the application detects no keypoints or lines (e.g. when the camera is covered with a hand), in which case an error occurs and the app closes:

"Error: keypoint list is empty
OpenCV: terminate handler is called! The last OpenCV error is:
OpenCV(3.4.5) Error: Assertion failed (type == src2.type() && src1.cols == src2.cols && (type == CV_32F || type == CV_8U)) in cv::batchDistance"

On the other hand, when the system loses tracking (which happens very rarely), it is not able to relocalize.
Do you have any idea why these two things occur?
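
For reference, a minimal standalone sketch of the first failure mode (placeholder names, not the project's code): with OpenCV 3.4.5, calling BFMatcher::knnMatch when one descriptor matrix is empty can abort with exactly the cv::batchDistance assertion quoted above, and checking both inputs for emptiness avoids it.

// Minimal sketch of the empty-descriptor failure mode; placeholder names.
#include <opencv2/features2d.hpp>
#include <vector>

int main() {
    cv::Mat query(10, 32, CV_8U, cv::Scalar(0)); // ten 32-byte binary descriptors
    cv::Mat train;                               // empty: nothing was detected
    cv::BFMatcher bfm(cv::NORM_HAMMING);
    std::vector<std::vector<cv::DMatch>> matches;
    // Without this guard, knnMatch on the empty side can trigger the
    // "type == src2.type() && src1.cols == src2.cols" assertion from above.
    if (!query.empty() && !train.empty())
        bfm.knnMatch(query, train, matches, 2);
    return 0;
}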

My configuration file for the Intel D435 is:

Camera.fx: 617.73
Camera.fy: 618.19
Camera.cx: 317.76
Camera.cy: 248.28
Camera.k1: 0
Camera.k2: 0
Camera.p1: 0
Camera.p2: 0
Camera.k3: 0
Camera.width: 640
Camera.height: 480
Camera.fps: 30.0
Camera.bf: 30.5
Camera.RGB: 1
ThDepth: 40.0
DepthMapFactor: 1000.0
ORBextractor.nFeatures: 1000
ORBextractor.scaleFactor: 1.2
ORBextractor.nLevels: 8
ORBextractor.iniThFAST: 20
ORBextractor.minThFAST: 7
Viewer.KeyFrameSize: 0.05
Viewer.KeyFrameLineWidth: 1
Viewer.GraphLineWidth: 0.9
Viewer.PointSize: 2
Viewer.CameraSize: 0.08
Viewer.CameraLineWidth: 3
Viewer.ViewpointX: 0
Viewer.ViewpointY: -0.7
Viewer.ViewpointZ: -1.8
Viewer.ViewpointF: 500
Plane.AssociationDisRef: 0.05
Plane.AssociationAngRef: 0.985 # 10 degrees
Plane.VerticalThreshold: 0.08716 # 85 degrees
Plane.ParallelThreshold: 0.9962 # 5 degrees
Plane.AngleInfo: 0.5
Plane.DistanceInfo: 50
Plane.Chi: 100
Plane.VPChi: 50
Plane.ParallelInfo: 0.5
Plane.VerticalInfo: 0.5
Plane.DistanceThreshold: 0.04
Plane.MFVerticalThreshold: 0.01
Surfel.distanceFar: 30.0
Surfel.distanceNear: 0.5
SavePath.Keyframe: "KeyFrameTrajectory.txt"
SavePath.Frame: "CameraTrajectory.txt"
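
These values can be sanity-checked with a minimal loader sketch, assuming the ORB_SLAM2-style cv::FileStorage settings parsing (the file needs the usual %YAML:1.0 header; "D435.yaml" is a placeholder path):

// Minimal settings-check sketch; "D435.yaml" is a placeholder path.
#include <opencv2/core/core.hpp>
#include <iostream>

int main() {
    cv::FileStorage fs("D435.yaml", cv::FileStorage::READ);
    if (!fs.isOpened()) {
        std::cerr << "Failed to open settings file" << std::endl;
        return 1;
    }
    float fx = fs["Camera.fx"];
    float fy = fs["Camera.fy"];
    float bf = fs["Camera.bf"];
    std::cout << "fx=" << fx << " fy=" << fy << " bf=" << bf << std::endl;
    return 0;
}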

And the D435 example code is:

// Note: the original include targets were stripped by the issue tracker's
// rendering; the headers below are reconstructed from what the code uses,
// and <System.h> is assumed to be the ORB_SLAM2-style system header.
#include <iostream>
#include <algorithm>
#include <vector>
#include <chrono>
#include <thread>
#include <atomic>
#include <cstdio>
#include <cstdlib>

#include <opencv2/core/core.hpp>

#include <librealsense2/rs.hpp>

#include <System.h>

using namespace std;

void stop_flag_detection();

// A flag to indicate whether a key has been pressed.
std::atomic_bool stop_flag(false);

int main(int argc, char** argv) try {

if (argc != 3) {
    cerr << endl << "Usage: ./rgbd_realsense path_to_vocabulary path_to_settings" << endl;
    return EXIT_FAILURE;
}

std::cout << "Querying Realsense device info..." << std::endl;

// Create librealsense context for managing devices
rs2::context ctx;
auto devs = ctx.query_devices();  // Get device list
int device_num = devs.size();
std::cout << "Device number: " << device_num << std::endl; // Device amount

// Query the info of first device
rs2::device dev = devs[0];  // If no device is connected, an rs2::error exception will be raised
// Device serial number (unique per device; can be used to select a device when multiple are connected)
std::cout << "Serial number: " << dev.get_info(RS2_CAMERA_INFO_SERIAL_NUMBER) << std::endl;

rs2::config cfg;
// By default all devices are configured; a specific device can be selected by its serial number
// Configure the color stream: 640*480, frame format: RGB8, FPS: 30
cfg.enable_stream(RS2_STREAM_COLOR, 640, 480, RS2_FORMAT_RGB8, 30);  // RGB8 corresponds to CV_8UC3 in OpenCV
// Configure the depth stream: 1280*720, frame format: Z16, FPS: 30
cfg.enable_stream(RS2_STREAM_DEPTH, 1280, 720, RS2_FORMAT_Z16, 30); // Z16 corresponds to CV_16U in OpenCV

std::cout << "Config RGB frame format to 8-channal RGB" << std::endl;
std::cout << "Config RGB and depth FPS to 30" << std::endl;

rs2::pipeline pipe;
pipe.start(cfg);
rs2::align align_to_color(RS2_STREAM_COLOR);
// Block program until frames arrive
rs2::frameset data = pipe.wait_for_frames();

rs2::depth_frame depth = data.get_depth_frame();
rs2::video_frame color = data.get_color_frame();

rs2::stream_profile depth_profile = depth.get_profile();
rs2::stream_profile color_profile = color.get_profile();

// Get RGB camera intrinsics
// Note that the change of config will cause the change of intrinsics
rs2::video_stream_profile cvsprofile(color_profile);
rs2::video_stream_profile dvsprofile(depth_profile);
rs2_intrinsics color_intrinsics = cvsprofile.get_intrinsics();
rs2_intrinsics depth_intrinsics = dvsprofile.get_intrinsics();

const int color_width = color_intrinsics.width;
const int color_height = color_intrinsics.height;
const int depth_width = 640;  // after align_to_color, depth frames match the 640*480 color stream, not depth_intrinsics.width
const int depth_height = 480; // (the native depth stream is 1280*720)

std::cout << "RGB Frame width: " << color_width << std::endl;
std::cout << "RGB Frame height: " << color_height << std::endl;
std::cout << "Depth Frame width: " << depth_width << std::endl;
std::cout << "Depth Frame height: " << depth_height << std::endl;
std::cout << "RGB camera intrinsics:" << std::endl;
std::cout << "fx: " << color_intrinsics.fx << std::endl;
std::cout << "fy: " << color_intrinsics.fy << std::endl;
std::cout << "cx: " << color_intrinsics.ppx << std::endl;
std::cout << "cy: " << color_intrinsics.ppy << std::endl;
std::cout << "RGB camera distortion coeffs:" << std::endl;
std::cout << "k1: " << color_intrinsics.coeffs[0] << std::endl;
std::cout << "k2: " << color_intrinsics.coeffs[1] << std::endl;
std::cout << "p1: " << color_intrinsics.coeffs[2] << std::endl;
std::cout << "p2: " << color_intrinsics.coeffs[3] << std::endl;
std::cout << "k3: " << color_intrinsics.coeffs[4] << std::endl;
//std::cout << "RGB camera distortion model: " << color_intrinsics.model << std::endl;

std::cout << "* Please adjust the parameters in config file accordingly *" << std::endl;

// Create SLAM system. It initializes all system threads and gets ready to process frames.
ORB_SLAM2::System SLAM(argv[1], argv[2], true);

// Vector for tracking time statistics
vector<float> vtimes_track;

std::thread stop_detect_thread = std::thread(stop_flag_detection);

std::cout << std::endl << "-------" << std::endl;
std::cout << "Start processing realsense stream ..." << std::endl;
std::cout << "Use 'p + enter' to end the system" << std::endl;

while (!stop_flag) {
    data = pipe.wait_for_frames();
    data = align_to_color.process(data);
    depth = data.get_depth_frame();
    color = data.get_color_frame();

    double time_stamp = data.get_timestamp();

    cv::Mat im_D(cv::Size(depth_width, depth_height), CV_16U, (void*)depth.get_data(), cv::Mat::AUTO_STEP);
    cv::Mat im_RGB(cv::Size(color_width, color_height), CV_8UC3, (void*)color.get_data(), cv::Mat::AUTO_STEP);

    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();

    // Pass the image to the SLAM system
    SLAM.Track(im_RGB, im_D, time_stamp);

    std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();
    double ttrack = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count();
    vtimes_track.push_back(ttrack);
}

stop_detect_thread.join();

// Stop all threads
SLAM.Shutdown();

// Tracking time statistics
sort(vtimes_track.begin(), vtimes_track.end());
float time_total = 0;
for (size_t i = 0; i < vtimes_track.size(); i++) {
    time_total += vtimes_track[i];
}

std::cout << "-------" << std::endl << std::endl;
std::cout << "median tracking time: " << vtimes_track[vtimes_track.size() / 2] << std::endl;
std::cout << "mean tracking time: " << time_total / vtimes_track.size() << std::endl;

// Save camera trajectory
SLAM.SaveTrajectoryTUM("CameraTrajectory.txt");
SLAM.SaveKeyFrameTrajectoryTUM("KeyFrameTrajectory.txt");

return EXIT_SUCCESS;

} catch (const rs2::error& e) {
    // Capture device exception
    std::cerr << "RealSense error calling " << e.get_failed_function()
              << "(" << e.get_failed_args() << "):\n    " << e.what() << std::endl;
    return EXIT_FAILURE;
} catch (const std::exception& e) {
    std::cerr << "Other error: " << e.what() << std::endl;
    return EXIT_FAILURE;
}

void stop_flag_detection() {
    char c;
    while (!stop_flag) {
        c = std::getchar();
        if (c == 'p') {
            stop_flag = true;
        }
    }
}
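
For completeness, the program is then run as in the usage string above (the vocabulary path assumes the stock ORB_SLAM2 ORBvoc.txt; both paths are placeholders):

./rgbd_realsense Vocabulary/ORBvoc.txt D435.yaml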

Thanks for everything

Wavelet303 commented 2 years ago

The first question is resolved. In the LSDMatcher.cpp file, the following condition must be added at line 214, so that the line-descriptor matrices are checked for emptiness before matching:

    if (ldesc1.rows > 0 && ldesc2.rows > 0) {
        bfm->knnMatch(ldesc1, ldesc2, lmatches, 2);

        double nn_dist_th, nn12_dist_th;
        const float minRatio = 1.0f / 1.5f;
        lineDescriptorMAD(lmatches, nn_dist_th, nn12_dist_th);
        nn12_dist_th = nn12_dist_th * 0.5;
        sort(lmatches.begin(), lmatches.end(), sort_descriptor_by_queryIdx());
        for (size_t i = 0; i < lmatches.size(); i++) {
            int qdx = lmatches[i][0].queryIdx;
            int tdx = lmatches[i][0].trainIdx;
            double dist_12 = lmatches[i][0].distance / lmatches[i][1].distance;
            if (dist_12 < minRatio) {
                MapLine* mapLine = vpMapLinesKF[qdx];
                if (mapLine) {
                    vpMapLineMatches[tdx] = mapLine;
                    nmatches++;
                }
            }
        }
    }
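
(This guard keeps an empty descriptor matrix from ever reaching cv::batchDistance inside knnMatch, which is exactly the assertion reported in the first post; when nothing is detected in a frame, the matching step is simply skipped.)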
razayunus commented 2 years ago

Hi, thanks for your interest and for pointing out these issues. Unfortunately, I cannot tell you why relocalization fails without testing/debugging it myself. I will give an update after having a look.