better-flow / evimo

A toolkit for dataset generation with event-based cameras
GNU General Public License v3.0

Camera pose of EVIMO v2 #25

Closed · qimaqi closed this issue 1 year ago

qimaqi commented 1 year ago

Hi, I am somewhat confused by the NPZ files in EVIMO v2. I observed that in dataset_info.npz I can get the camera position at different timestamps. However, when I plot the Flea camera trajectory and the Samsung camera trajectory, their relative position differs a lot, and it is totally different from the extrinsics. Could you explain how you obtain the camera positions in dataset_info.npz?

aftersomemath commented 1 year ago

Hi,

The camera poses are multiplied with their first transform so that the first pose is always identity. This choice is a holdover from EVIMO1 and EVIMO2v1 and is done in the C++ code that generates the txt version of the dataset. Unfortunately, it results in confusing data as you mentioned. However, the data is valid.
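For illustration, here is a minimal sketch of what that normalization implies; it is not taken from the dataset tools, and the 4x4 pose representation and the exact composition order are assumptions:

```python
# Minimal sketch (not from the dataset tools) of the first-pose
# normalization described above, assuming 4x4 homogeneous poses and
# left-composition with the inverse of the first pose.
import numpy as np

def normalize_to_first(poses):
    """Re-express each pose relative to the first one, so the first
    returned pose is the identity."""
    first_inv = np.linalg.inv(poses[0])
    return [first_inv @ T for T in poses]

# Because each camera's trajectory is normalized by *its own* first pose,
# the Samsung and Flea trajectories end up in different coordinate frames,
# and plotting them together will not show their true relative placement.
```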

I will see if I can update the documentation to make this idiosyncrasy clear.

Please let me know if you find anything else confusing!

qimaqi commented 1 year ago

Hello, and thanks for the fast response! I am sure the data is valid, and the documentation for EVIMO v2 is some of the best I have seen. What I am trying to do now is project the events from the Samsung camera into the Flea camera frame, and according to the documentation I found two ways to do it. First, I can use the extrinsic parameters provided in the raw data, which give me the relative transform between Samsung and Flea (as described at https://better-flow.github.io/evimo/docs/raw-sequence-structure.html: 4 lines, the first of which is 6 parameters tx ty tz roll pitch yaw - Euler angles for camera-to-Vicon).
One more question about this Euler-angle transform: I do not know the rotation order. Is it exactly roll-pitch-yaw, i.e. x-y-z, or some other order?

Second, I also observed that in dataset_info.npz I can get the camera pose from the 'meta' key (there is a 'cam' key which provides the pose, as https://better-flow.github.io/evimo/docs/ground-truth-format.html#evimo2v2-txt-vs-evimo2v2-npz shows: 'cam': {'pos': {'q': {'w': 1.0, 'x': 3.6e-05, 'y': 0.000342, 'z': 0.000158}, 'rpy': {'p': 0.000683, 'r': 7.2e-05, 'y': 0.000316}, 't': {'x': -0.000103, 'y': -0.000202, 'z': 2.9e-05}}}). What I did is draw the Samsung and Flea trajectories from this data, and I found that the difference between them changes over time. So I suppose, as you answered, the camera poses are multiplied with their first transform so that the first pose is identity, which means the poses in dataset_info.npz for the Samsung and Flea cameras are not in the same coordinate system? If that is the case, then what I should do is use the provided extrinsics.
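(For reference, this is roughly how I convert one of those 'cam' entries into a 4x4 transform when drawing the trajectories; just a sketch using SciPy, with key names taken from the excerpt above.)

```python
# Sketch (not from the dataset tools): turn one 'cam' pose entry into a
# 4x4 homogeneous transform; key names follow the documentation excerpt.
import numpy as np
from scipy.spatial.transform import Rotation

def pose_dict_to_matrix(pos):
    q, t = pos['q'], pos['t']
    T = np.eye(4)
    # SciPy expects quaternions in (x, y, z, w) order
    T[:3, :3] = Rotation.from_quat([q['x'], q['y'], q['z'], q['w']]).as_matrix()
    T[:3, 3] = [t['x'], t['y'], t['z']]
    return T

# e.g. T_cam = pose_dict_to_matrix(frame['cam']['pos'])
```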

aftersomemath commented 1 year ago

I am very glad to hear that you find the documentation good! Thank you.

You are correct that the Samsung and Flea trajectories are not in the same coordinate frame due to the initial multiplication, so the extrinsics must be used to find the transform between the two camera frames. Tonight I will upload an NPZ version of these extrinsics that you can extract on top of the NPZ version of the dataset. I will also upload the code that generates that overlay.
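Conceptually, combining the two raw-data extrinsics looks something like the sketch below. It is only a sketch: it assumes each extrinsic (tx ty tz roll pitch yaw) encodes a camera-to-Vicon transform, and the Euler order passed to SciPy must match the dataset's convention.

```python
# Sketch only (not from the dataset tools): build camera-to-rig (Vicon)
# transforms from the 6 raw extrinsic parameters and chain them to map
# points from the Samsung frame into the Flea frame.
import numpy as np
from scipy.spatial.transform import Rotation

def rig_from_cam(tx, ty, tz, roll, pitch, yaw, euler_order='xyz'):
    """4x4 camera-to-rig transform; the Euler order is an assumption."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_euler(euler_order, [roll, pitch, yaw]).as_matrix()
    T[:3, 3] = [tx, ty, tz]
    return T

# With both transforms built from each camera's parameters:
#   T_flea_from_samsung = np.linalg.inv(T_rig_from_flea) @ T_rig_from_samsung
#   p_flea = T_flea_from_samsung @ p_samsung
```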

We have used this extrinsic overlay to reproject the RGB data into the event camera frame (the opposite of what you want to do). I think I can upload this code as well.

However, in the independent motion sequences, the extrinsics are not enough information to transform RGB to event camera or event camera to RGB. You also need to take into account the time-varying transformations of the objects. The RGB to event camera code I will upload takes care of this properly.

qimaqi commented 1 year ago

Thanks a lot! It would be perfect if you could also provide the code that generates this overlay! Also, thanks for the heads up about the time-varying transformations.

aftersomemath commented 1 year ago

Here is the code to generate the extrinsics from the raw data.

Here is the extrinsics overlay generated by that code. Just extract it on top of the NPZ version of the dataset.

Here is where evimo_flow.py reprojects Flea3 RGB data into the event camera frame. Following the calculations upwards from there shows how the motion of the objects and the displacement between the two cameras are both accounted for.
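Schematically (this is not the actual evimo_flow.py code, and the intrinsics/transform names are placeholders), the per-pixel reprojection works like this:

```python
# Schematic of the reprojection: lift a Flea pixel to 3D using its depth,
# move it into the event camera frame via the (possibly time-varying)
# relative transform, then project with the event camera intrinsics.
# K_flea, K_event, and T_event_from_flea are assumed to come from the
# calibration and extrinsics.
import numpy as np

def reproject_pixel(u, v, depth, K_flea, K_event, T_event_from_flea):
    # Back-project the pixel into the Flea camera frame
    x = (u - K_flea[0, 2]) / K_flea[0, 0]
    y = (v - K_flea[1, 2]) / K_flea[1, 1]
    p_flea = np.array([x * depth, y * depth, depth, 1.0])
    # Transform into the event camera frame and project
    p_evt = T_event_from_flea @ p_flea
    uv = K_event @ (p_evt[:3] / p_evt[2])
    return uv[0], uv[1]
```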

Hope that helps!

qimaqi commented 1 year ago

That perfectly solves my questions, thanks! One last thing: event cameras like the Prophesee also generate a bias file containing the contrast-sensitivity settings used for the experiments. I wonder if that is also available in this amazing dataset. Thanks again!

aftersomemath commented 1 year ago

Unfortunately, I cannot find any bias files.

Here is the version of the prophesee ros wrapper that was used. You can see that the launch files specify no bias file by default. I checked our startup scripts that run the launch files and confirmed no bias file was set either.

I'm not familiar enough with the Prophesee API to say what biases were used, since none were set explicitly by the startup scripts. However, I can confirm from my own experience that both Prophesee cameras performed extremely similarly, so their settings were probably very similar.