lastflowers / envio

Code for "Photometric Visual-Inertial Navigation with Uncertainty-Aware Ensembles" in TRO 2022
https://ieeexplore.ieee.org/document/9686364
GNU General Public License v3.0
139 stars 20 forks source link

Divergence #2

Closed engcang closed 2 years ago

engcang commented 2 years ago

Hi.

I am trying to run the code on KAIST VIO dataset and no luck yet. I used these following config and launch files

[Config file] ~~~python # euroc yaml file cam0: T_cam_imu: [0.0403, 0.0263, 0.9988, -0.0364, -0.9990, -0.0204, -0.0398, -0.1376, 0.0194, -0.9994, 0.0271, -0.0188, 0, 0, 0, 1.0] camera_model: pinhole distortion_coeffs: [0.006896928127777268, -0.009144207062654397, 0.000254113977103925, 0.0021434982252719545] distortion_model: radtan intrinsics: [380.9229090195708, 380.29264802262736, 324.68121181846755, 224.6741321466431] resolution: [640, 480] timeshift_cam_imu: -0.029958533056650416 cam1: T_cam_imu: [-0.0391, 0.0250, 0.9989, -0.0366, -0.9990, -0.0203, -0.0386, -0.1365, 0.0193, -0.9995, 0.0258, 0.0311, 0, 0, 0, 1.0] # T_cn_cnm1: # [0.9999992248836708, 0.00006384241340452582, 0.0012434452955667624, -0.049960282472300055, # -0.00006225102643531651, 0.9999991790958949, -0.0012798173093508036, -0.00005920119010064575, # -0.001243525981443161, 0.0012797389115975439, 0.9999984079544582, -0.000143160033953494487, # 0, 0, 0, 1.0] camera_model: pinhole distortion_coeffs: [0.007044055287844759, -0.010251485722185347, 0.0006674304399871926, 0.001678899816379666] distortion_model: radtan intrinsics: [380.95187095303424, 380.3065956074995, 324.0678433553536, 225.9586983198407] resolution: [640, 480] timeshift_cam_imu: -0.030340187355085417 ~~~
[launch file] ~~~xml ~~~
[compressed to raw image file] ~~~python #!/usr/bin/env python # -*- coding: utf-8 -*- import time import rospy import cv2 import numpy as np import sys import signal from sensor_msgs.msg import Image from sensor_msgs.msg import CompressedImage from sensor_msgs.msg import Imu from cv_bridge import CvBridge, CvBridgeError def signal_handler(signal, frame): # ctrl + c -> exit program print('You pressed Ctrl+C!') sys.exit(0) signal.signal(signal.SIGINT, signal_handler) class converter(): def __init__(self): rospy.init_node('compressed_to_raw', anonymous=True) self.comp_sub1 = rospy.Subscriber('/camera/infra1/image_rect_raw/compressed',CompressedImage,self.callback1) self.comp_sub2 = rospy.Subscriber('/camera/infra2/image_rect_raw/compressed',CompressedImage,self.callback2) self.imu_subscriber = rospy.Subscriber('/mavros/imu/data', Imu, self.imu_callback) self.img_pub1 = rospy.Publisher('/camera/infra1/img',Image,queue_size=100) self.img_pub2 = rospy.Publisher('/camera/infra2/img',Image,queue_size=100) self.imu_publisher = rospy.Publisher('/imu_data', Imu, queue_size=100) self.bridge = CvBridge() def callback1(self, data): try : np_arr = np.fromstring(data.data, np.uint8) image_np = cv2.imdecode(np_arr, cv2.IMREAD_COLOR) # OpenCV >= 3.0: cv_image=cv2.cvtColor(image_np, cv2.COLOR_BGR2GRAY) img=self.bridge.cv2_to_imgmsg(cv_image, "mono8") img.header.stamp = data.header.stamp # img.header.stamp = rospy.Time.now() self.img_pub1.publish(img) except CvBridgeError as e: pass def callback2(self, data): try : np_arr = np.fromstring(data.data, np.uint8) image_np = cv2.imdecode(np_arr, cv2.IMREAD_COLOR) # OpenCV >= 3.0: cv_image=cv2.cvtColor(image_np, cv2.COLOR_BGR2GRAY) img=self.bridge.cv2_to_imgmsg(cv_image, "mono8") img.header.stamp = data.header.stamp # img.header.stamp = rospy.Time.now() self.img_pub2.publish(img) except CvBridgeError as e: pass # def callback(self,data): # try : ## self.time = time.time() # cvimage=self.bridge.imgmsg_to_cv2(data,"bgr8") # cv_image=cv2.cvtColor(cvimage,cv2.COLOR_BGR2GRAY) # img=self.bridge.cv2_to_imgmsg(cv_image, "mono8") # img.header.stamp = rospy.Time.now() # self.img_publisher.publish(img) # #print(time.time()-self.time) # except CvBridgeError as e: # pass def imu_callback(self,data): # data.header.stamp = rospy.Time.now() new_data = data new_data.header.stamp = data.header.stamp self.imu_publisher.publish(data) if __name__=='__main__': cvt=converter() time.sleep(1) while 1: pass ~~~

I am pretty sure the config file (intrinsic and extrinsic) is correct. example video clip with the dataset But with any bag file of the dataset, envio diverges even with static starts.

Can you give me some comments to solve this issue?

Thank you in advance.

lastflowers commented 2 years ago

Hi, thanks for your interest in our work!

I have tested normal infinity rosbag file and envio works. It seems that the extrinsic parameters was not properly copied. I share the config and launch file. I have tuned some parameters: decreasing maximum depth for rather short stereo baseline and increasing R_std due to the flickering lights from the motion capture system. I hope this would help you and please ask me any questions !

example_kaistvio

<launch>
    <!--For the back-end -->
    <arg name="calibration_file" default="$(find ensemble_vio)/config/kaistvio/camchain-imucam-kaistvio.yaml"/>

    <node pkg="ensemble_vio" type="envio_node" name="envio_node">

        <rosparam command="load" file="$(arg calibration_file)" />

        <!-- Remapping : put your topics -->
        <remap from="/imu" to="/mavros/imu/data"/>
        <remap from="/left_image" to="/camera/infra1/image_rect_raw"/>
        <remap from="/right_image" to="/camera/infra2/image_rect_raw"/>

        <!-- Vision processing parameters -->
        <param name="nx" value="25" type="int"/>
        <param name="ny" value="15" type="int"/>
        <param name="min_depth" value="0.3" type="double"/>
        <param name="max_depth" value="5" type="double"/>
        <param name="min_parallax" value="1" type="double"/>
        <param name="ransac_thr" value="1" type="double"/>

        <!-- Initial std_dev [rad, m, m/s, m/s^2, rad/s] -->
        <param name="P0/attitude" value="0.0175" type="double"/>
        <param name="P0/position" value="3e-2" type="double"/>
        <param name="P0/velocity" value="1e-1" type="double"/>
        <param name="P0/ba" value="0.1962" type="double"/>
        <param name="P0/bg" value="1.0e-3" type="double"/>
        <param name="P0/depth" value="1.5" type="double"/>
        <param name="P0/idepth" value="0.1" type="double"/>
        <param name="num_init_samples" value="500" type="int"/>

        <!-- Process noises [rad/s^(1/2), m/s^(3/2), m/s^(5/2), rad/s^(3/2)]-->
        <param name="Q/velocity" value="2.3e-2" type="double"/>
        <param name="Q/attitude" value="1.0e-4" type="double"/>
        <param name="Q/ba" value="2.5e-3" type="double"/>
        <param name="Q/bg" value="7e-5" type="double"/>

        <!-- Estimator parameters -->
        <param name="inverse_depth" value="false" type="bool"/>
        <param name="R_std" value="20" type="double"/>
        <param name="max_lifetime" value="300" type="int"/>
        <param name="thr_stop" value="1e-3" type="double"/>
        <param name="max_diff" value="30" type="double"/>
        <param name="N_en" value="100" type="int"/>
        <param name="use_stochastic_gradient" value="true" type="bool"/>
        <param name="thr_weak" value="3" type="double"/>

        <!-- Sparse setting -->
        <param name="thr_num" value="150" type="int"/>
        <param name="uniform_dist" value="30" type="int"/>
        <param name="max_iter" value="10" type="int"/>

        <!-- Vehicle to IMU z-y-x euler sequence -->
        <param name="roll_imu_vehicle" value="0.0" type="double"/>
        <param name="pitch_imu_vehicle" value="0.0" type="double"/>
        <param name="yaw_imu_vehicle" value="0.0" type="double"/>

    </node>

</launch>
# kaistvio dataset yaml file

cam0:
  T_cam_imu:
    [-0.04030123999740945,   -0.9989998755524683,    0.01936643232049068,   0.02103955032447366,
      0.026311325355146964,  -0.020436499663524704, -0.9994448777394171,   -0.038224929976612206,
      0.9988410905708309,    -0.0397693113802049,    0.027108627033059024, -0.1363488241088845,
      0.0,                    0.0,                   0.0,                   1.0]
  cam_overlaps: [1]
  camera_model: pinhole
  distortion_coeffs: [0.006896928127777268, -0.009144207062654397, 0.000254113977103925, 0.0021434982252719545]
  distortion_model: radtan
  intrinsics: [380.9229090195708, 380.29264802262736, 324.68121181846755, 224.6741321466431]
  resolution: [640, 480]
  rostopic: /camera/infra1/image_rect_raw
  timeshift_cam_imu: -0.029958533056650416
cam1:
  T_cam_imu:
    [-0.03905752472566068,   -0.9990498568899562,     0.019336318430946575,   -0.02909273113160158,
      0.025035478432625047,  -0.020323396666370924,  -0.9994799569614147,     -0.03811090793611019,
      0.99892328763622,      -0.03855311914877835,    0.02580547271309183,    -0.13656684822705098,
      0.0,                    0.0,                    0.0,                     1.0]
  T_cn_cnm1: 
    [0.9999992248836708,      6.384241340452582e-05,   0.0012434452955667624,   -0.049960282472300055,
    -6.225102643531651e-05,   0.9999991790958949,     -0.0012798173093508036,   -5.920119010064575e-05,
    -0.001243525981443161,    0.0012797389115975439,   0.9999984079544582,      -0.00014316003395349448,
     0.0,                     0.0,                     0.0,                      1.0]
  cam_overlaps: [0]
  camera_model: pinhole
  distortion_coeffs: [0.007044055287844759, -0.010251485722185347, 0.0006674304399871926, 0.001678899816379666]
  distortion_model: radtan
  intrinsics: [380.95187095303424, 380.3065956074995, 324.0678433553536, 225.9586983198407]
  resolution: [640, 480]
  rostopic: /camera/infra2/image_rect_raw
  timeshift_cam_imu: -0.030340187355085417
engcang commented 2 years ago

Gotcha!!! It works well and sorry for the stupid mistakes.

I found there is slight delay between input data and output pose. How can I remove that? I think this is not the computational problem but the structure of the code maybe.

Why do I get the reversed pose against the ground truth data as attached?

sdf

lastflowers commented 2 years ago

No problem :)

The delay is directly related to the number of iteration in the filter update. I guess that the delay is originated from the threshold for the early break. I used 20 fps images, the threshold was set to 20 ms, but this would result a delay in the 30 fps images as in the dataset. You can set this parameter in the new commit. In my laptop (i7-7820), I don't see any remarkable delay with the 10 ms threshold.

https://github.com/lastflowers/envio/blob/1c251d8e6d2ac815f43cbee098501a3f9d7d5d0f/launch/nesl_envio_kaistvio.launch#L52

In envio, the z axis of the reference frame is defined as the gravity direction ("down"). So, if your ground-truth is referenced at the "up", this could happen. The easiest way to flip the z axis is to transform the Xi_ before publishing. Note that this is just for visualization.

https://github.com/lastflowers/envio/blob/1c251d8e6d2ac815f43cbee098501a3f9d7d5d0f/src/core/sl_iekf.cpp#L917

engcang commented 2 years ago

You are so nice. I appreciate again.

Final question: I guess flipping only Z axis is not enough but rather we have to rotate 180 degree around X axis considering the angles and Y axis value also. Am i on the right way?

lastflowers commented 2 years ago

You're welcome.

I am not sure what exactly the "180 degree rotation" means, but I guess you should multiply the below matrix to the envio position estimate, R^{enu}_{ned} = [0 1 0; 1 0 0; 0 0 -1]. Also, note that the initial yaw should be aligned also.

engcang commented 2 years ago

Thanks~ I successfully made it work and recorded a video clip to see the performance. I simply applied x' = x y' = -y z' = -z

"180 degree rotation" meant rolling 180 degree, since your coordinate in the Rviz from the above attached captured picture in in NED, not the NWU (ROS coordinate).

But as you said, to consider the roll pitch yaw also, the following rotational matrix should be multiplied, I guess. R = [1 0 0 0 -1 0 0 0 -1]

Thank you again.