Running OpenVINS for very long durations

tianyilim commented 3 weeks ago

Hello, thanks for the good work and great code!

I've been trying to get OpenVINS to run on large-scale, long datasets, in particular the Vision Benchmark in Rome dataset. This dataset is quite interesting as it contains handheld sequences > 20min and > 2km in length.

Initially, OpenVINS seems to do quite well, but as the trajectory gets longer, I find the covariance estimates also increase, and I guess without place recognition techniques to "anchor" it, the trajectory eventually drifts away Youtube video (skip to ~3:15 to see filter divergence).

Maybe this is a limitation of a pure EKF-based VIO system, but are there any things that i can try to tackle this issue? E.g force reinitialization when covariance increases, tuning configs, etc.? Thanks in advance!

Overall Config

```yaml %YAML:1.0 # need to specify the file type at the top! verbosity: "INFO" # ALL, DEBUG, INFO, WARNING, ERROR, SILENT use_fej: true # if first-estimate Jacobians should be used (enable for good consistency) integration: "rk4" # discrete, rk4, analytical (if rk4 or analytical used then analytical covariance propagation is used) use_stereo: true # if we have more than 1 camera, if we should try to track stereo constraints between pairs max_cameras: 2 # how many cameras we have 1 = mono, 2 = stereo, >2 = binocular (all mono tracking) calib_cam_extrinsics: true # if the transform between camera and IMU should be optimized R_ItoC, p_CinI calib_cam_intrinsics: true # if camera intrinsics should be optimized (focal, center, distortion) calib_cam_timeoffset: true # if timeoffset between camera and IMU should be optimized calib_imu_intrinsics: true # if imu intrinsics should be calibrated (rotation and skew-scale matrix) calib_imu_g_sensitivity: false # if gyroscope gravity sensitivity (Tg) should be calibrated max_clones: 20 # how many clones in the sliding window # ! set a bit larger hopefully for local 'loop closure' max_slam: 100 # number of features in our state vector max_slam_in_update: 25 # update can be split into sequential updates of batches, how many in a batch max_msckf_in_update: 50 # how many MSCKF features to use in the update # ! VBR Spagna initializes okay dt_slam_delay: 0.0 # delay before initializing (helps with stability from bad initialization...) # ! VBR Spagna weirdness in grav vector gravity_mag: 9.7 # magnitude of gravity in this location feat_rep_msckf: "GLOBAL_3D" feat_rep_slam: "ANCHORED_MSCKF_INVERSE_DEPTH" feat_rep_aruco: "ANCHORED_MSCKF_INVERSE_DEPTH" # zero velocity update parameters we can use # we support either IMU-based or disparity detection. try_zupt: false zupt_chi2_multipler: 0 # set to 0 for only disp-based zupt_max_velocity: 0.05 zupt_noise_multiplier: 50 zupt_max_disparity: 0.5 # set to 0 for only imu-based zupt_only_at_beginning: true # ================================================================== # ================================================================== init_window_time: 2.0 # how many seconds to collect initialization information init_imu_thresh: 1.5 # threshold for variance of the accelerometer to detect a "jerk" in motion init_max_disparity: 10.0 # max disparity to consider the platform stationary (dependent on resolution) init_max_features: 50 # how many features to track during initialization (saves on computation) init_dyn_use: true # if dynamic initialization should be used init_dyn_mle_opt_calib: false # if we should optimize calibration during intialization (not recommended) init_dyn_mle_max_iter: 50 # how many iterations the MLE refinement should use (zero to skip the MLE) init_dyn_mle_max_time: 0.05 # how many seconds the MLE should be completed in init_dyn_mle_max_threads: 8 # how many threads the MLE should use init_dyn_num_pose: 6 # number of poses to use within our window time (evenly spaced) init_dyn_min_deg: 10.0 # orientation change needed to try to init init_dyn_inflation_ori: 10 # what to inflate the recovered q_GtoI covariance by init_dyn_inflation_vel: 100 # what to inflate the recovered v_IinG covariance by init_dyn_inflation_bg: 10 # what to inflate the recovered bias_g covariance by init_dyn_inflation_ba: 100 # what to inflate the recovered bias_a covariance by init_dyn_min_rec_cond: 1e-12 # reciprocal condition number thresh for info inversion init_dyn_bias_g: [ 0.0053, -0.0014, -0.0000 ] # initial gyroscope bias guess init_dyn_bias_a: [ 0.1118, 0.1242, 0.3905 ] # initial accelerometer bias guess # init_dyn_bias_g: [ 0.0, 0.0, 0.0 ] # initial gyroscope bias guess # init_dyn_bias_a: [ 0.0, 0.0, 0.0 ] # initial accelerometer bias guess # ================================================================== # ================================================================== record_timing_information: false # if we want to record timing information of the method record_timing_filepath: "/tmp/traj_timing.txt" # https://docs.openvins.com/eval-timing.html#eval-ov-timing-flame # if we want to save the simulation state and its diagional covariance # use this with rosrun ov_eval error_simulation save_total_state: false filepath_est: "/tmp/ov_estimate.txt" filepath_std: "/tmp/ov_estimate_std.txt" filepath_gt: "/tmp/ov_groundtruth.txt" # ================================================================== # ================================================================== # our front-end feature tracking parameters # ! "real time" performance is not key so I increased here a lot # we have a KLT and descriptor based (KLT is better implemented...) use_klt: true # if true we will use KLT, otherwise use a ORB descriptor + robust matching num_pts: 400 # number of points (per camera) we will extract and try to track fast_threshold: 10 # threshold for fast extraction (warning: lower threshs can be expensive) grid_x: 5 # extraction sub-grid count for horizontal direction (uniform tracking) grid_y: 5 # extraction sub-grid count for vertical direction (uniform tracking) min_px_dist: 15 # distance between features (features near each other provide less information) knn_ratio: 0.65 # descriptor knn threshold for the top two descriptor matches track_frequency: 30.0 # frequency we will perform feature tracking at (in frames per second / hertz) # ! raw resolution is quite high. Found that this helps tracking downsample_cameras: true # will downsample image in half if true num_opencv_threads: 4 # -1: auto, 0-1: serial, >1: number of threads histogram_method: "CLAHE" # NONE, HISTOGRAM, CLAHE # aruco tag tracker for the system # DICT_6X6_1000 from https://chev.me/arucogen/ use_aruco: false num_aruco: 1024 downsize_aruco: true # ================================================================== # ================================================================== # camera noises and chi-squared threshold multipliers up_msckf_sigma_px: 1 up_msckf_chi2_multipler: 1 up_slam_sigma_px: 1 up_slam_chi2_multipler: 1 up_aruco_sigma_px: 1 up_aruco_chi2_multipler: 1 # masks for our images use_mask: false # imu and camera spacial-temporal # imu config should also have the correct noise values relative_config_imu: "vbr_imu_chain.yaml" relative_config_imucam: "vbr_imucam_chain.yaml" ```

IMU Config

```yaml %YAML:1.0 imu0: T_i_b: - [1.0, 0.0, 0.0, 0.0] - [0.0, 1.0, 0.0, 0.0] - [0.0, 0.0, 1.0, 0.0] - [0.0, 0.0, 0.0, 1.0] # ! these are the raw values from dataset calib # accelerometer_noise_density: 0.001602090212250569 # [ m / s^2 / sqrt(Hz) ] ( accel "white noise" ) # accelerometer_random_walk: 1.4361954439703917e-05 # [ m / s^3 / sqrt(Hz) ]. ( accel bias diffusion ) # gyroscope_noise_density: 0.00013361278165885748 # [ rad / s / sqrt(Hz) ] ( gyro "white noise" ) # gyroscope_random_walk: 3.1760920521174444e-06 # [ rad / s^2 / sqrt(Hz) ] ( gyro bias diffusion ) # ! Practically speaking, try to increase the noise values to make the filter more robust to noise (lets try 10x) accelerometer_noise_density: 0.01602090212250569 # [ m / s^2 / sqrt(Hz) ] ( accel "white noise" ) accelerometer_random_walk: 1.4361954439703917e-04 # [ m / s^3 / sqrt(Hz) ]. ( accel bias diffusion ) gyroscope_noise_density: 0.0013361278165885748 # [ rad / s / sqrt(Hz) ] ( gyro "white noise" ) gyroscope_random_walk: 3.1760920521174444e-05 # [ rad / s^2 / sqrt(Hz) ] ( gyro bias diffusion ) rostopic: /imu/data time_offset: 0.0 update_rate: 100.0 # three different modes supported: # "calibrated" (same as "kalibr"), "kalibr", "rpng" model: "kalibr" # how to get from Kalibr imu.yaml result file: # - Tw is imu0:gyroscopes:M: # - R_IMUtoGYRO: is imu0:gyroscopes:C_gyro_i: # - Ta is imu0:accelerometers:M: # - R_IMUtoACC not used by Kalibr # - Tg is imu0:gyroscopes:A: Tw: - [ 1.0, 0.0, 0.0 ] - [ 0.0, 1.0, 0.0 ] - [ 0.0, 0.0, 1.0 ] R_IMUtoGYRO: - [ 1.0, 0.0, 0.0 ] - [ 0.0, 1.0, 0.0 ] - [ 0.0, 0.0, 1.0 ] Ta: - [ 1.0, 0.0, 0.0 ] - [ 0.0, 1.0, 0.0 ] - [ 0.0, 0.0, 1.0 ] R_IMUtoACC: - [ 1.0, 0.0, 0.0 ] - [ 0.0, 1.0, 0.0 ] - [ 0.0, 0.0, 1.0 ] Tg: - [ 0.0, 0.0, 0.0 ] - [ 0.0, 0.0, 0.0 ] - [ 0.0, 0.0, 0.0 ] ```

Camera Config

```yaml %YAML:1.0 # ! values taken from the dataset calibration cam0: T_imu_cam: - [ 0.037625047406, -0.007947906881, 0.999260319728, 0.108169265686] - [-0.999241088127, -0.010385921748, 0.037541715839, 0.245263320981] - [ 0.010079861425, -0.999914478046, -0.008332645918, -0.060574570772] - [ 0.000000000000, 0.000000000000, 0.000000000000, 1.000000000000] cam_overlaps: [1] camera_model: pinhole distortion_coeffs: [-0.15442153959026667, 0.22415792664801637, 0.00028611532170761544, -0.0011553024948397013] distortion_model: radtan intrinsics: [1265.4157426547047, 1259.125033829179, 649.1764873455104, 349.11875004082356] resolution: [1388, 700] rostopic: /camera_left/image_raw timeshift_cam_imu: 0.0014439866415333613 cam1: T_imu_cam: - [ 0.047533765306, 0.002997724519, 0.998865133441, 0.117518620394] - [ -0.998851236543, -0.005926227938, 0.047550889366, -0.235758955147] - [ 0.006062046927, -0.999977946492, 0.002712584915, -0.066699484195] - [ 0.000000000000, 0.000000000000, 0.000000000000, 1.000000000000] cam_overlaps: [0] camera_model: pinhole distortion_coeffs: [-0.13964919973164264, 0.3032246444678759, -0.0006841993380782645, -0.0017521213083201822] distortion_model: radtan intrinsics: [1270.980680475146, 1266.6167801504557, 692.2080529147544, 346.53773474480096] resolution: [1388, 700] rostopic: /camera_right/image_raw timeshift_cam_imu: 0.0015213183603639074 ```

tianyilim commented 2 weeks ago

As an initial strategy, I wanted to reset the filter state, and use the last-known state as a initialization point.

Looking at #398, re-starting the ROS node is not an option for me.

I modified the Serial ROS1 bag reader code to attempt this, I'm not too sure if I did this properly -- but if someone could sanity check my impl, it would be greatly appreciated :)

https://github.com/tianyilim/open_vins/blob/3d7cec971e78a82aa1fd6f9c535e1b79b2b79630/ov_msckf/src/ros/ROS1Visualizer.cpp#L487

Essentially, I modified the ROS1Visualizer. After feeding a camera measurement, we can monitor the covariance and see if it has exceeded a threshold. If so, we trigger filter resetting

    // After processing, we need to check if the filter has diverged. We do this by a simple check of the covariance for now.
    std::shared_ptr<State> state = _app->get_state();
    std::vector<std::shared_ptr<Type>> statevars;
    statevars.push_back(state->_imu->pose()->p());
    statevars.push_back(state->_imu->pose()->q());
    Eigen::Matrix<double, 6, 6> covariance_posori = StateHelper::get_marginal_covariance(_app->get_state(), statevars);

    // Check if the covariance is too large
    const double frobnorm = covariance_posori.norm();
    PRINT_INFO(CYAN "@ %0.4f | Covariance Frobenius Norm: %.4f from VIOManager %s\n" RESET, message.timestamp, frobnorm,
               _app->name.c_str());

    // TODO to be tuned!
    constexpr double threshold_covariance = 1.0;
    if (frobnorm > threshold_covariance) {
      PRINT_ERROR("Covariance is too large! (%.3f) Resetting...\n", frobnorm);

      // Get rid of all features and clones in the state so that it is as "clean" as possible
      StateHelper::marginalize_all_clones(state);
      StateHelper::marginalize_all_slam(state);
      camera_queue.clear();

      // Naive way, but we need to reset the state
      reset_state = std::make_unique<Eigen::Matrix<double, 17, 1>>();
      reset_state->block(1, 0, 4, 1) = state->_imu->quat(); // Apart from position, we re-initialize with the curr state
      // reset_state->block(5, 0, 3, 1) = state->_imu->pos();  // cannot reset position so easily??
      reset_state->block(5, 0, 3, 1) = Eigen::Vector3d::Zero(); // Reset position
      reset_state->block(8, 0, 3, 1) = state->_imu->vel();     // take the current velocity
      reset_state->block(11, 0, 3, 1) = state->_imu->bias_g(); // take the current biases
      reset_state->block(14, 0, 3, 1) = state->_imu->bias_a(); //

      // Destroy the current application and create a new one to ensure all state is "reset"
      VioManagerOptions currParams = _app->get_params();
      _app.reset();
      _app = std::make_shared<VioManager>(currParams, "reset"+std::to_string(reset_counter++));
      PRINT_INFO(YELLOW "Reset VIOManager %s\n" RESET, _app->name.c_str());
    }

Knowing the previous state, we can then "abuse" the initialize_with_gt function. This is done here (only in stereo for now):

https://github.com/tianyilim/open_vins/blob/3d7cec971e78a82aa1fd6f9c535e1b79b2b79630/ov_msckf/src/ros/ROS1Visualizer.cpp#L583

  // check if we need to reinit
  if (reset_state) {
    (*reset_state)(0, 0) = timestamp;
    _app->initialize_with_gt(*reset_state);
    reset_state.reset(); // Clear this so we do not reinit again
    PRINT_INFO(YELLOW "Reinitialize VIOManager %s\n" RESET, _app->name.c_str());
    PRINT_DEBUG(YELLOW "Number of states after reset: IMU clones: %d, SLAM features: %d\n" RESET, _app->get_state()->_clones_IMU.size(),
               _app->get_state()->_features_SLAM.size());
  }

omeredemen commented 1 week ago

it is a good work thanks for sharing. i wanted to do a reset state like you did. but i haven't started yet. can you share what was the result you get with your updates?

goldbattle commented 15 hours ago

This might be the covariance is becoming ill-defined, or the statistical chi2 checks start to fail at that point. Could you take a look at that point and see if the filter stops getting feature updates (MSCKF and SLAM)?

I think one concern with your approach is that you will remove all the tracked features, so basically have non-continuous pose tracking. Probably a better method is to "update" with a fake GPS / mocap sensors to reduce the uncertainty of the system to remove this instability (if you want to avoid this issue with the global uncertainty being unbounded).

rpng / open_vins

Running OpenVINS for very long durations #481