ethnhe / FFB6D

[CVPR2021 Oral] FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation.
MIT License

Visualization: Wrong Translation & Rotation #69

Closed MiriamJo closed 2 years ago

MiriamJo commented 2 years ago

I seem to have a problem with the visualization of my data. The center and keypoints are placed correctly over my object. However, the rendered pose does not reflect the object's real rotation and translation. I have attached an image to illustrate:

image

As you can see, the mask is placed correctly over the object. However, the rotation and translation of the overlaid object do not match the actual ones at all. I am confused about how this is possible: I got the annotations and everything else from BlenderProc, so they should be correct.

However, to create the rendered and fused data, I used pre-made sampled RT poses from other objects, since I couldn't find where in pvnet-rendering to generate them myself. Could this be causing the error?

Maybe I did something wrong with the .ply export? I am not sure what it means to export it in meters. I modeled the real object and then scaled it up by a factor of 1000 in Blender. After that, the distance measurements were correct.
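For readers unsure about the unit question above, here is a minimal sketch of what converting a mesh between millimeters and meters amounts to, plus a quick sanity check on the result. The thread does not state which unit FFB6D expects, so the factor of 0.001 and the cube vertices are assumptions for illustration only; `scale_vertices` and `bbox_diagonal` are hypothetical helper names, not functions from the repository.

```python
import math

def scale_vertices(vertices, factor):
    """Multiply every vertex coordinate by a constant factor."""
    return [(x * factor, y * factor, z * factor) for x, y, z in vertices]

def bbox_diagonal(vertices):
    """Length of the axis-aligned bounding-box diagonal of the mesh."""
    xs, ys, zs = zip(*vertices)
    return math.sqrt(
        (max(xs) - min(xs)) ** 2
        + (max(ys) - min(ys)) ** 2
        + (max(zs) - min(zs)) ** 2
    )

# Hypothetical mesh: the 8 corners of a cube with 100 mm edges.
cube_mm = [(x, y, z) for x in (0.0, 100.0) for y in (0.0, 100.0) for z in (0.0, 100.0)]

# Assumed conversion from millimeters to meters.
cube_m = scale_vertices(cube_mm, 0.001)

# The diagonal should be plausible for the real object in the target unit.
print(bbox_diagonal(cube_mm), bbox_diagonal(cube_m))
```

If the diagonal printed for the exported model is off by a factor of 1000 compared with the real object's size in the unit used by the pose translations, the mesh and the annotations are probably in different units.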

MiriamJo commented 2 years ago

I managed to get to a point where my models are not shifted anymore. I guess my mistake was not updating the intrinsic matrix for both Blender AND the dataset. As I proceeded, I gradually understood more about the intrinsic matrix and why there are two matrices.

(Blender: the intrinsics used for rendering the images, i.e. those of my rendered dataset. Dataset: the intrinsics of the dataset I already had.) Since I created the dataset myself with BlenderProc, all of the matrices should be the same. After changing every single matrix to the one I need, I got the following results:

keypoint_vis2

I just wanted to leave this here so that anyone else who runs into this won't be confused as well.
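To illustrate why a mismatched intrinsic matrix shifts the overlay, here is a minimal sketch of the pinhole projection. It projects the same 3D point (in camera coordinates) with two different intrinsic matrices and reports the pixel offset between the results. Both matrices and the point are made-up values for illustration, not the ones from this thread or from FFB6D, and `project` is a hypothetical helper.

```python
def project(point_cam, K):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    fx, fy = K[0][0], K[1][1]   # focal lengths in pixels
    cx, cy = K[0][2], K[1][2]   # principal point in pixels
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical intrinsics used to render the images.
K_render = [[600.0, 0.0, 320.0],
            [0.0, 600.0, 240.0],
            [0.0, 0.0, 1.0]]

# Hypothetical intrinsics left over from a different dataset.
K_other = [[700.0, 0.0, 330.0],
           [0.0, 700.0, 250.0],
           [0.0, 0.0, 1.0]]

# A point 2 m in front of the camera (assumed unit: meters).
point = (0.25, -0.125, 2.0)

u1, v1 = project(point, K_render)
u2, v2 = project(point, K_other)

# A non-zero offset means the drawn model lands on the wrong pixels.
print("offset in pixels:", (u2 - u1, v2 - v1))
```

When the visualization uses the same matrix that produced the images, the offset is zero, which is consistent with the fix described above.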

MiriamJo commented 2 years ago

Nevertheless, I still got an error when executing the LineMOD visualization script in Windows Subsystem for Linux (WSL2). Only one image can be opened at a time, but I guess that's just a WSL issue.

nachi9211 commented 2 years ago

Nevertheless, I still got an error when executing the LineMOD visualization script in Windows Subsystem for Linux (WSL2). Only one image can be opened at a time, but I guess that's just a WSL issue.

Hi Miriam, could you please give me an introduction to how I could go about finding the intrinsic matrix for my dataset?

MiriamJo commented 2 years ago

Hello Nachi, usually every dataset for object detection provides its own intrinsic matrix. It contains the parameters of the camera the photos were taken with (a virtual or a real camera), such as the focal length. Which dataset are you working with? A dataset should always be annotated with the intrinsics for every photo (ideally the same matrix for all photos in the dataset). If you create your own dataset, you should always save the camera parameters for EVERY PHOTO.

Here is a really good blog post about the intrinsic camera matrix (and extrinsic, if you go back a page): Dissecting the Camera Matrix, Part 3: The Intrinsic Matrix
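As a rough illustration of what the intrinsic matrix contains, the sketch below builds one from lens parameters using the standard pinhole relation (focal length in pixels = focal length in mm × image width in pixels / sensor width in mm). It assumes square pixels, no skew, and a principal point at the image center, which may not hold for a real camera. The function name and the example values are hypothetical; for a real camera, the intrinsics normally come from calibration or from the dataset's own metadata.

```python
def intrinsics_from_lens(focal_mm, sensor_width_mm, width_px, height_px):
    """Build a 3x3 pinhole intrinsic matrix from lens and image parameters.

    Assumes square pixels, zero skew, and a centered principal point.
    """
    fx = focal_mm * width_px / sensor_width_mm  # focal length in pixels
    fy = fx                                     # square-pixel assumption
    cx = width_px / 2.0                         # principal point at the center
    cy = height_px / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]

# Example values chosen for illustration: 50 mm lens, 36 mm sensor, 640x480 image.
K = intrinsics_from_lens(50.0, 36.0, 640, 480)
for row in K:
    print(row)
```

Saving this matrix alongside every rendered or captured image keeps the training data and the visualization consistent with each other.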

nachi9211 commented 1 year ago

Thank you @MiriamJo. That's helpful.