esteimle commented 2 years ago

Hi. I've been trying for a few days to get my USB camera to work with ORBSLAM3. I started with the mono_realsense_t265.cc as my base, made a copy of that file, and then added opencv cv::VideoCapture to connect to my USB camera. When I debug, I can see the frames coming in as well as the timestamps updating with each frame. I displayed some pictures as well to verify they are real. I also calibrated my camera and modified the t265 yaml file to use those parameters. Has anyone else hit something similar? My next steps are going to be digging into the code step by step but that's going to take a while ;) Thank you for any advice.

Command line output `~/workspace/ORB_SLAM3$ ./Examples/Monocular/mono_razor Vocabulary/ORBvoc.txt ./Examples/Monocular/Razor.yaml

ORB-SLAM3 Copyright (C) 2017-2020 Carlos Campos, Richard Elvira, Juan J. Gómez, José M.M. Montiel and Juan D. Tardós, University of Zaragoza. ORB-SLAM2 Copyright (C) 2014-2016 Raúl Mur-Artal, José M.M. Montiel and Juan D. Tardós, University of Zaragoza. This program comes with ABSOLUTELY NO WARRANTY; This is free software, and you are welcome to redistribute it under certain conditions. See LICENSE.txt.

Input sensor was set to: Monocular Loading settings from ./Examples/Monocular/Razor.yaml Camera1.k1 optional parameter does not exist... -Loaded camera 1 Camera.newHeight optional parameter does not exist... Camera.newWidth optional parameter does not exist... -Loaded image info -Loaded ORB settings Viewer.imageViewScale optional parameter does not exist... -Loaded viewer settings System.LoadAtlasFromFile optional parameter does not exist... System.SaveAtlasToFile optional parameter does not exist... -Loaded Atlas settings System.thFarPoints optional parameter does not exist... -Loaded misc parameters

SLAM settings: -Camera 1 parameters (Pinhole): [ 243.38832092285156 241.70191955566406 365.0584716796875 238.09236145019531 ] -Original image size: [ 640 , 480 ] -Current image size: [ 640 , 480 ] -Sequence FPS: 60 -Features per image: 1000 -ORB scale factor: 1.2000000476837158 -ORB number of scales: 8 -Initial FAST threshold: 20 -Min FAST threshold: 7

Loading ORB Vocabulary. This could take a while... Vocabulary loaded!

Initialization of Atlas from scratch Creation of new map with id: 0 Creation of new map with last KF id: 0 Seq. Name: There are 1 cameras in the atlas Camera 0 is pinhole Starting the Viewer [ WARN:12] global /home/pradeep/workspace/opencv_build/opencv/modules/videoio/src/cap_gstreamer.cpp (956) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1 `

My Config file `%YAML:1.0

--------------------------------------------------------------------------------------------

Camera Parameters. Adjust them!

--------------------------------------------------------------------------------------------

File.version: "1.0"

Camera.type: "PinHole"

Left Camera calibration and distortion parameters (OpenCV)

Camera1.fx: 243.38832282 Camera1.fy: 241.7019257 Camera1.cx: 365.05847057 Camera1.cy: 238.09236466

Kannala-Brandt distortion parameters

Camera1.k1: -0.00530046410858631

Camera1.k2: 0.0423333682119846

Camera1.k3: -0.03949885815382

Camera1.k4: 0.00682387687265873

pinhole

Camera.k1: 0.02635559 Camera.k2: -0.01760739 Camera.p1: 0.00409038 Camera.p2: 0.00891666 Camera.k3: 0.00497196

need to modify orblasm to take this

Camera.k4: 0.5483986735343933

Camera.k5: -2.6671268939971924

Camera.k6: 1.5503228902816772

#

you can duplicate the codes that reading parameters three times in "tracking.cc" with k4,k5,k6 you have:

node = fSettings["Camera.k3"];

if(!node.empty() && node.isReal())

{

mDistCoef.resize(5);

mDistCoef.at(4) = node.real();

#

if(b_miss_params)

{

return false;

}

Camera resolution

Camera.width: 640 Camera.height: 480

Camera frames per second

Camera.fps: 60

Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)

Camera.RGB: 1

--------------------------------------------------------------------------------------------

ORB Parameters

--------------------------------------------------------------------------------------------

ORB Extractor: Number of features per image

ORBextractor.nFeatures: 1000 # Tested with 1250

ORB Extractor: Scale factor between levels in the scale pyramid

ORBextractor.scaleFactor: 1.2

ORB Extractor: Number of levels in the scale pyramid

ORBextractor.nLevels: 8

ORB Extractor: Fast threshold

Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.

Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST

You can lower these values if your images have low contrast

ORBextractor.iniThFAST: 20 # 20 ORBextractor.minThFAST: 7 # 7

--------------------------------------------------------------------------------------------

Viewer Parameters

--------------------------------------------------------------------------------------------

Viewer.KeyFrameSize: 0.05 Viewer.KeyFrameLineWidth: 1.0 Viewer.GraphLineWidth: 0.9 Viewer.PointSize: 2.0 Viewer.CameraSize: 0.08 Viewer.CameraLineWidth: 3.0 Viewer.ViewpointX: 0.0 Viewer.ViewpointY: -0.7 Viewer.ViewpointZ: -3.5 Viewer.ViewpointF: 500.0 `

My Test Code (I see images and timestamps working in debug) `/**

This file is part of ORB-SLAM3
Copyright (C) 2017-2021 Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M.M. Montiel and Juan D. Tardós, University of Zaragoza.
Copyright (C) 2014-2016 Raúl Mur-Artal, José M.M. Montiel and Juan D. Tardós, University of Zaragoza.
ORB-SLAM3 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
ORB-SLAM3 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with ORB-SLAM3.
If not, see http://www.gnu.org/licenses/. */

include

include <opencv2/core/core.hpp>

//#include <librealsense2/rs.hpp>

include

using namespace std;

bool b_continue_session;

void exit_loop_handler(int s){ cout << "Finishing session" << endl; b_continue_session = false;

}

int main(int argc, char **argv) {

if(argc < 3 || argc > 4)
{
    cerr << endl << "Usage: ./mono_realsense_t265 path_to_vocabulary path_to_settings (trajectory_file_name)" << endl;
    return 1;
}

string file_name;
bool bFileName = false;

if (argc == 4)
{
    file_name = string(argv[argc-1]);
    bFileName = true;
}

struct sigaction sigIntHandler;

sigIntHandler.sa_handler = exit_loop_handler;
sigemptyset(&sigIntHandler.sa_mask);
sigIntHandler.sa_flags = 0;

sigaction(SIGINT, &sigIntHandler, NULL);
b_continue_session = true;

// Declare RealSense pipeline, encapsulating the actual device and sensors
//rs2::pipeline pipe;
// Create a configuration for configuring the pipeline with a non default profile
//rs2::config cfg;

// Enable the left camera
//cfg.enable_stream(RS2_STREAM_FISHEYE, 1, RS2_FORMAT_Y8);
//cfg.enable_stream(RS2_STREAM_FISHEYE, 2, RS2_FORMAT_Y8);

//rs2::pipeline_profile pipe_profile = pipe.start(cfg);

cout.precision(17);

/*cout << "Start processing sequence ..." << endl;
cout << "Images in the sequence: " << nImages << endl;
cout << "IMU data in the sequence: " << nImu << endl << endl;*/

// Create SLAM system. It initializes all system threads and gets ready to process frames.
ORB_SLAM3::System SLAM(argv[1],argv[2],ORB_SLAM3::System::MONOCULAR, true);
float imageScale = SLAM.GetImageScale();

cv::Mat imCV;

//rs2::stream_profile fisheye_stream = pipe_profile.get_stream(RS2_STREAM_FISHEYE, 1);
//rs2_intrinsics intrinsics = fisheye_stream.as<rs2::video_stream_profile>().get_intrinsics();
int width_img = 1920;
int height_img = 1080;

double t_resize = 0.f;
double t_track = 0.f;

cv::VideoCapture cap(0); // open the default camera
if(!cap.isOpened()){  // check if we succeeded
    std::cout << "Whoops could not open camera" << std::endl;
    return -1;
}

double timestamp_ms = 0;

cv::Mat frame;

while(b_continue_session)
{
    //cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
    // Get the stream from the device
    //rs2::frameset frame_set = pipe.wait_for_frames();

    //double timestamp_ms = frame_set.get_timestamp(); //RS2_FRAME_METADATA_SENSOR_TIMESTAMP
    // cout << "timestamp: " << timestamp_ms << endl;

    //if(rs2::video_frame image_frame = frame_set.first_or_default(RS2_STREAM_FISHEYE))
    //{
        //rs2::video_frame frame = frame_set.get_fisheye_frame(1); // Left image

    //imCV = cv::Mat(cv::Size(width_img, height_img), CV_8UC1, (void*)(frame.get_data()), cv::Mat::AUTO_STEP);
    cap >> frame;
    timestamp_ms = cap.get(cv::CAP_PROP_POS_MSEC);
    //std::cout << "timestamp: " << timestamp_ms << std::endl;
    cv::cvtColor(frame, imCV, cv::COLOR_BGR2GRAY);
    //cv::imshow("inputframe", imCV);

        if(imageScale != 1.f)
        {

ifdef REGISTER_TIMES

#ifdef COMPILEDWITHC11
            std::chrono::steady_clock::time_point t_Start_Resize = std::chrono::steady_clock::now();
#else
            std::chrono::monotonic_clock::time_point t_Start_Resize = std::chrono::monotonic_clock::now();
#endif

endif

            int width = imCV.cols * imageScale;
            int height = imCV.rows * imageScale;
            cv::resize(imCV, imCV, cv::Size(width, height));

ifdef REGISTER_TIMES

#ifdef COMPILEDWITHC11
            std::chrono::steady_clock::time_point t_End_Resize = std::chrono::steady_clock::now();
#else
            std::chrono::monotonic_clock::time_point t_End_Resize = std::chrono::monotonic_clock::now();
#endif
            t_resize = std::chrono::duration_cast<std::chrono::duration<double,std::milli> >(t_End_Resize - t_Start_Resize).count();
            SLAM.InsertResizeTime(t_resize);

endif

        }

        // clahe
        //clahe->apply(imLeft,imLeft);
        //clahe->apply(imRight,imRight);

ifdef REGISTER_TIMES

ifdef COMPILEDWITHC11

        std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();

else

        std::chrono::monotonic_clock::time_point t1 = std::chrono::monotonic_clock::now();

endif

        // Pass the image to the SLAM system
        SLAM.TrackMonocular(imCV, timestamp_ms);

ifdef REGISTER_TIMES

ifdef COMPILEDWITHC11

        std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();

else

        std::chrono::monotonic_clock::time_point t2 = std::chrono::monotonic_clock::now();

endif

        t_track = t_resize + std::chrono::duration_cast<std::chrono::duration<double,std::milli> >(t2 - t1).count();
        SLAM.InsertTrackTime(t_track);

endif

    //}
}

//pipe.stop();

// Stop all threads
SLAM.Shutdown();

return 0;

} `

Maithilishetty commented 2 years ago

@esteimle I am having a similar issue while running the examples on ROS. Have you managed to fix this yet?

esteimle commented 2 years ago

yeah, just an a hour ago. I lowered the resolution to 640x480 in the yaml file, then also lowered it in the opencv capture (not resizing it later but setting the camera to input 640x480). It seemed to get lost pretty easy but it made a 1/4 map of my office :) I think the high res input from my camera was bogging down the opencv camera capture. Either that or I had my resolutions misaligned between yaml and real life.

Maithilishetty commented 2 years ago

Oh great, thanks for the response! I do not believe that's a possible issue for me since I am just running the EuRoC rosbag file from the examples section. I am not sure how to fix it.

UZ-SLAMLab / ORB_SLAM3

black screen and "waiting for images" when trying mono with USB camera #501

--------------------------------------------------------------------------------------------

Camera Parameters. Adjust them!

--------------------------------------------------------------------------------------------

Left Camera calibration and distortion parameters (OpenCV)

Kannala-Brandt distortion parameters

Camera1.k1: -0.00530046410858631

Camera1.k2: 0.0423333682119846

Camera1.k3: -0.03949885815382

Camera1.k4: 0.00682387687265873

pinhole

need to modify orblasm to take this

Camera.k4: 0.5483986735343933

Camera.k5: -2.6671268939971924

Camera.k6: 1.5503228902816772

you can duplicate the codes that reading parameters three times in "tracking.cc" with k4,k5,k6 you have:

node = fSettings["Camera.k3"];

if(!node.empty() && node.isReal())

{

mDistCoef.resize(5);

mDistCoef.at(4) = node.real();

if(b_miss_params)

{

return false;

}

Camera resolution

Camera frames per second

Color order of the images (0: BGR, 1: RGB. It is ignored if images are grayscale)

--------------------------------------------------------------------------------------------

ORB Parameters

--------------------------------------------------------------------------------------------

ORB Extractor: Number of features per image

ORB Extractor: Scale factor between levels in the scale pyramid

ORB Extractor: Number of levels in the scale pyramid

ORB Extractor: Fast threshold

Image is divided in a grid. At each cell FAST are extracted imposing a minimum response.

Firstly we impose iniThFAST. If no corners are detected we impose a lower value minThFAST

You can lower these values if your images have low contrast

--------------------------------------------------------------------------------------------

Viewer Parameters

--------------------------------------------------------------------------------------------

include

include

include

include

include

include

include

include

include <opencv2/core/core.hpp>

include

ifdef REGISTER_TIMES

endif

ifdef REGISTER_TIMES

endif

ifdef REGISTER_TIMES

ifdef COMPILEDWITHC11

else

endif

endif

ifdef REGISTER_TIMES

ifdef COMPILEDWITHC11

else

endif

endif