uzh-rpg / DSEC

Events and frame alignment #25

Closed UtharaKeerthan closed 3 years ago

UtharaKeerthan commented 3 years ago

Hi, I used the events from the events_left HDF5 file and the images from images_rectified_left to get the events and images for the zurich_city_04_b sequence, and I used the rectify.h5 file to get the rectified events. But I still get a misaligned overlay like the one below:

(image: rectified_events)

May I know what transformations have to be applied to the events or the image files so that the image and the events are aligned?

I understand that we need to convert the 2D point of an event to a 3D point, apply the transformation T_10 given in the cam_to_cam.yaml file, and project back to a 2D point in the other frame. Could you provide an example of this?

I'm working with rosbags. Could you please share the rosbag mentioned in issue #12 (https://download.ifi.uzh.ch/rpg/tmp/interlaken_00_kalibr.bag) again?

Thanks in advance.

magehrig commented 3 years ago

Hi @UtharaKeerthan, the calibration data (intrinsics and extrinsics) is available for each sequence on the DSEC webpage: https://dsec.ifi.uzh.ch/dsec-datasets/download/. You should find the answer to your question in the closed issues https://github.com/uzh-rpg/DSEC/issues/12 and https://github.com/uzh-rpg/DSEC/issues/14. What people have done so far is a very simple warping that assumes the scene is infinitely far away. This kind of works, but it is not very accurate for close objects.
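
In a nutshell, that warp reduces to a homography between the two rectified cameras (a sketch using the naming convention of the cam_to_cam.yaml files; the full snippet further down in this thread builds exactly this matrix):

# With the scene at infinity, the translation between the cameras drops out,
# and the warp from rectified event-camera pixels (camRect0) to rectified
# frame-camera pixels (camRect1) is the pure-rotation homography
#
#   x_r1 ~ K_r1 @ R_r1_r0 @ inv(K_r0) @ x_r0
#
# where x_r0 and x_r1 are homogeneous pixel coordinates.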

I will later see if I can upload the rosbag again. Out of curiosity, why do you need/want it? This rosbag is only the calibration recording and will most likely not give you better results for the calibration.

UtharaKeerthan commented 3 years ago

Thanks for the response, I will look into issues #12 and #14. I thought the rosbag contained the combined events and images of the left event and frame cameras; my mistake, I did not notice that the rosbag is only the calibration recording.

magehrig commented 3 years ago

If you want to work with rosbags, there is a tool to convert the event data in the DSEC format into rosbag format: https://github.com/tub-rip/events_h52bag

Note that this will drastically increase storage requirements.

magehrig commented 3 years ago

I see that some people struggle with the warping that I explained in some issues here, so I created a simple code snippet from which you can start. It maps the left image into the event camera frame by assuming that the scene is far away. You may have to adapt the directory structure to your case. Also note that you don't actually need the full Transform class; working purely with rotations is sufficient in this case. I just implemented that class in another project and used it for convenience. Hope that helps.

import os
import argparse
from pathlib import Path

import numpy as np
import cv2
from omegaconf import OmegaConf
from scipy.spatial.transform import Rotation as Rot

class Transform:
    def __init__(self, translation: np.ndarray, rotation: Rot):
        if translation.ndim > 1:
            self._translation = translation.flatten()
        else:
            self._translation = translation
        assert self._translation.size == 3
        self._rotation = rotation

    @staticmethod
    def from_transform_matrix(transform_matrix: np.ndarray):
        translation = transform_matrix[:3, 3]
        rotation = Rot.from_matrix(transform_matrix[:3, :3])
        return Transform(translation, rotation)

    @staticmethod
    def from_rotation(rotation: Rot):
        return Transform(np.zeros(3), rotation)

    def R_matrix(self):
        return self._rotation.as_matrix()

    def R(self):
        return self._rotation

    def t(self):
        return self._translation

    def T_matrix(self) -> np.ndarray:
        return self._T_matrix_from_tR(self._translation, self._rotation.as_matrix())

    def q(self):
        # returns (x, y, z, w)
        return self._rotation.as_quat()

    def euler(self):
        return self._rotation.as_euler('xyz', degrees=True)

    def __matmul__(self, other):
        # a (self), b (other)
        # returns a @ b
        #
        # R_A | t_A   R_B | t_B   R_A @ R_B | R_A @ t_B + t_A
        # --------- @ --------- = ---------------------------
        # 0   | 1     0   | 1     0         | 1
        #
        rotation = self._rotation * other._rotation
        translation = self._rotation.apply(other._translation) + self._translation
        return Transform(translation, rotation)

    def inverse(self):
        #           R_AB  | A_t_AB
        # T_AB =    ------|-------
        #           0     | 1
        #
        # to be converted to
        #
        #           R_BA  | B_t_BA    R_AB.T | -R_AB.T @ A_t_AB
        # T_BA =    ------|------- =  -------|-----------------
        #           0     | 1         0      | 1
        #
        # This is numerically more stable than matrix inversion of T_AB
        rotation = self._rotation.inv()
        translation = - rotation.apply(self._translation)
        return Transform(translation, rotation)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('seqpath')

    args = parser.parse_args()

    seqpath = Path(args.seqpath)
    assert seqpath.is_dir()
    print(f'start processing: {seqpath}')

    confpath = seqpath / 'calibration' / 'cam_to_cam.yaml'
    assert confpath.exists()
    conf = OmegaConf.load(confpath)

    images_left_dir = seqpath / 'images' / 'left'
    outdir = images_left_dir / 'ev_inf'
    os.makedirs(outdir, exist_ok=True)

    image_in_dir = images_left_dir / 'rectified'

    # Get mapping for this sequence:

    # Rectified intrinsics: camera_matrix is stored as [fx, fy, cx, cy];
    # scatter into the diagonal (fx, fy) and the last column (cx, cy).
    K_r0 = np.eye(3)
    K_r0[[0, 1, 0, 1], [0, 1, 2, 2]] = conf['intrinsics']['camRect0']['camera_matrix']
    K_r1 = np.eye(3)
    K_r1[[0, 1, 0, 1], [0, 1, 2, 2]] = conf['intrinsics']['camRect1']['camera_matrix']

    # Rotations from the original camera frames to their rectified frames.
    R_r0_0 = Rot.from_matrix(np.array(conf['extrinsics']['R_rect0']))
    R_r1_1 = Rot.from_matrix(np.array(conf['extrinsics']['R_rect1']))

    T_r0_0 = Transform.from_rotation(R_r0_0)
    T_r1_1 = Transform.from_rotation(R_r1_1)
    T_1_0 = Transform.from_transform_matrix(np.array(conf['extrinsics']['T_10']))

    # Chain: rectified cam0 -> cam0 -> cam1 -> rectified cam1.
    T_r1_r0 = T_r1_1 @ T_1_0 @ T_r0_0.inverse()
    R_r1_r0_matrix = T_r1_r0.R().as_matrix()
    # Pure-rotation homography mapping rectified event-camera pixels to
    # rectified frame pixels; valid for points at infinity (translation ignored).
    P_r1_r0 = K_r1 @ R_r1_r0_matrix @ np.linalg.inv(K_r0)

    # Event camera resolution
    ht = 480
    wd = 640
    # coords: ht, wd, 2
    coords = np.stack(np.meshgrid(np.arange(wd), np.arange(ht)), axis=-1)
    # coords_hom: ht, wd, 3
    coords_hom = np.concatenate((coords, np.ones((ht, wd, 1))), axis=-1)
    # mapping: ht, wd, 3 -- for each event-camera pixel, the homogeneous
    # coordinates of the corresponding pixel in the rectified frame image
    mapping = (P_r1_r0 @ coords_hom[..., None]).squeeze()
    # mapping: ht, wd, 2 -- dehomogenize to pixel coordinates
    mapping = (mapping/mapping[..., -1][..., None])[..., :2]
    mapping = mapping.astype('float32')

    for entry in image_in_dir.iterdir():
        assert entry.suffix == '.png'
        image_out_file = outdir / entry.name
        if image_out_file.exists():
            continue

        image_in = cv2.imread(str(entry))
        # For each event-camera pixel, sample the frame image at the mapped location.
        image_out = cv2.remap(image_in, mapping, None, interpolation=cv2.INTER_CUBIC)
        cv2.imwrite(str(image_out_file), image_out)

    print(f'done processing: {seqpath}')
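
For completeness: the snippet takes the sequence directory as its only argument, expects calibration/cam_to_cam.yaml and images/left/rectified inside it, and writes the warped images to images/left/ev_inf. So if you save it as, say, warp_image_to_events.py (hypothetical name), you would run python warp_image_to_events.py /path/to/zurich_city_04_b.
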
UtharaKeerthan commented 3 years ago

Thanks for the help @magehrig. My rosbag of left events and left images is aligned now:

(image: rct_evts_img_remap)

danqu130 commented 2 years ago

@magehrig Using this code, I found that some regions at the bottom of the aligned image are missing, just like in the picture above. I think this is because these areas do not exist in the original image, yet they do contain events. Is there disparity and optical flow ground truth for these areas in the test set? And will this negatively affect the performance of image-based algorithms?

magehrig commented 2 years ago

@danqu130 Just saw your question now. For future reference: no, there is no optical flow ground truth in this area.

cameralabhw commented 1 year ago

Hi @UtharaKeerthan, could you share your code for aligning the event and RGB data? I'm having some issues with this and would appreciate your help! By the way, how do you build the event frames shown in your overlay image?

magehrig commented 1 year ago

@cameralabhw I posted some code a while ago (https://github.com/uzh-rpg/DSEC/issues/25#issuecomment-956491275); is this not what you need?

cameralabhw commented 1 year ago

@magehrig Hi, what I was trying to do is align the event frames with the RGB frames while maintaining the RGB resolution. Your code downscales the RGB to match the event resolution; is this done to match the fields of view, or could the alignment be done without resizing the RGB?

In my case I want to map the events onto the RGB image, so that the RGB camera works as the "main" camera.

magehrig commented 1 year ago

Assuming that you want to know, for each event, where it should be placed in the frame (again assuming infinite depth), you can do the following: compute P_r1_r0 = K_r1 @ R_r1_r0 @ inv(K_r0) exactly as in the snippet above, then for each rectified event at pixel (x, y) compute P_r1_r0 @ (x, y, 1)^T and dehomogenize. The result is the event's location in the rectified frame at full RGB resolution.

That's it
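
A minimal sketch of that per-event warp, vectorized with numpy (warp_events_to_frame is a hypothetical name; xy is assumed to be an N x 2 array of rectified event coordinates, and P_r1_r0 is the matrix computed in the snippet above):

import numpy as np

def warp_events_to_frame(xy: np.ndarray, P_r1_r0: np.ndarray) -> np.ndarray:
    # xy: (N, 2) rectified event pixel coordinates in camRect0.
    # P_r1_r0: 3x3 homography K_r1 @ R_r1_r0 @ inv(K_r0) from the snippet above.
    xy_hom = np.concatenate((xy, np.ones((xy.shape[0], 1))), axis=1)  # (N, 3)
    mapped = (P_r1_r0 @ xy_hom.T).T                                   # (N, 3)
    # Dehomogenize: (N, 2) pixel coordinates in camRect1 (full RGB resolution).
    return mapped[:, :2] / mapped[:, 2:3]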

magehrig commented 1 year ago

If you have some kind of dense representation of the event data, it is probably a better idea to warp that event representation directly to the frames. In this case, you would have to change the approach a bit: for each pixel in the frame, find the corresponding location in the event representation and sample/interpolate there.

I discussed this aspect in this comment a while ago: https://github.com/uzh-rpg/DSEC/issues/14#issuecomment-841348958
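
A minimal sketch of that inverse mapping under the same infinite-depth assumption (the 1440 x 1080 frame resolution matches the DSEC frame cameras; P_r1_r0 comes from the snippet above and ev_repr stands in for your dense event representation, both defined here only as placeholders):

import numpy as np
import cv2

# Placeholders; in practice P_r1_r0 is computed as in the snippet above and
# ev_repr is your dense event representation (e.g. an event count image).
P_r1_r0 = np.eye(3)
ev_repr = np.zeros((480, 640), dtype=np.float32)

# Invert the homography: map rectified frame pixels (camRect1) back to
# rectified event-camera pixels (camRect0).
P_r0_r1 = np.linalg.inv(P_r1_r0)

ht, wd = 1080, 1440  # frame camera resolution
coords = np.stack(np.meshgrid(np.arange(wd), np.arange(ht)), axis=-1)
coords_hom = np.concatenate((coords, np.ones((ht, wd, 1))), axis=-1)
mapping = (P_r0_r1 @ coords_hom[..., None]).squeeze()
mapping = (mapping / mapping[..., -1][..., None])[..., :2].astype('float32')

# For each frame pixel, sample the event representation at the mapped location;
# the result lives at full frame resolution.
ev_repr_in_frame = cv2.remap(ev_repr, mapping, None, interpolation=cv2.INTER_LINEAR)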

LLHEVER commented 1 year ago

Hi, I want to know whether this code is also suitable for mapping the right image to the right event camera frame?