alexsax / 2D-3D-Semantics

The data skeleton from Joint 2D-3D-Semantic Data for Indoor Scene Understanding
http://3dsemantics.stanford.edu
Apache License 2.0

Projecting the perspective images to the semantic mesh (Semantic.obj) #26

Closed · mboussah closed this issue 5 years ago

mboussah commented 5 years ago

Hi,

I am trying to project the available 2D perspective images (using their corresponding poses in folder area_x/data/pose) onto the provided semantic mesh (area_x/3d/semantic_obj), but I can't find the correct camera parameters to do so. The documentation is not clear on this point. I tried different combinations using the so-called "camera_rt_matrix", "final_camera_rotation", "camera_original_rotation"... and still can't find the correct pose. I am using the MVE environment to visualize both the mesh and the camera poses. For instance, here is an illustration of what happens when using the pose provided in camera_rt_matrix:

[pose illustration]

Any help would be much appreciated @ir0

Thanks

alexsax commented 5 years ago

Hi, thanks for the question!

You are looking at the right type of data. The correct extrinsic parameters for a perspective image can be found in matrix form via camera_rt_matrix, or alternatively as a location plus Euler angles via final_camera_rotation (XYZ Euler) and camera_location (XYZ location). These parameters are the same for all meshes (i.e. identical for semantic.obj, semantic_pretty.obj, and rgb.obj).
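If you prefer to work outside Blender, the camera_rt_matrix from the pose JSON can be converted to a camera-to-world pose directly. A minimal sketch in plain Python, assuming camera_rt_matrix is stored as a 3×4 row-major [R|t] world-to-camera matrix (the function name and truncated file path below are illustrative, not part of the dataset's tooling):

```python
import json

def rt_to_world(rt):
    """Invert a 3x4 [R|t] world-to-camera matrix (a rigid transform),
    returning a 4x4 camera-to-world matrix. The inverse of a rigid
    transform is [R^T | -R^T t], so no general matrix inversion is needed."""
    R = [row[:3] for row in rt]
    t = [row[3] for row in rt]
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    loc = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]  # camera center
    return [Rt[0] + [loc[0]],
            Rt[1] + [loc[1]],
            Rt[2] + [loc[2]],
            [0.0, 0.0, 0.0, 1.0]]

# Usage sketch (path truncated; the pose JSONs live under area_x/data/pose/):
# pose = json.load(open("area_3/data/pose/camera_..._pose.json"))
# world = rt_to_world(pose["camera_rt_matrix"])
```

The translation column of the result is the camera center in world coordinates, which should land inside the mesh if the convention assumption holds.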

I just confirmed the process by doing this manually in Blender 2.79 for the following pair (left: RGB image, right: screen cap of a visualization of the camera in Blender, using pose info from the corresponding JSON file):

area_3/data/rgb/camera_842facf372454735b7fd2880e14de97f_lounge_2_frame_4_domain_rgb.png
area_3/data/pose/camera_842facf372454735b7fd2880e14de97f_lounge_2_frame_4_domain_pose.json
[side-by-side comparison image]

Here are the steps that I used. 1) Load Area 3's semantic.obj file into Blender using the default settings. Note that Blender loads meshes using the following convention, which may differ from the software you are using:

[screenshot: Blender's mesh import axis convention]

2) Set the camera location to [9.506696, -2.915594, 1.420641] and the Euler angles to [1.8459843397140503, 0.007588082458823919, 1.4350531101226807], using the XYZ convention.
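For anyone applying these Euler angles outside Blender: as I understand Blender's XYZ Euler convention, the rotations are applied about the global X, then Y, then Z axes, which corresponds to the matrix product Rz · Ry · Rx. A sketch under that assumption (the helper name is mine, not from the dataset):

```python
import math

def euler_xyz_to_matrix(ex, ey, ez):
    """Rotation matrix from XYZ Euler angles: rotate about X, then Y,
    then Z, i.e. R = Rz @ Ry @ Rx."""
    cx, sx = math.cos(ex), math.sin(ex)
    cy, sy = math.cos(ey), math.sin(ey)
    cz, sz = math.cos(ez), math.sin(ez)
    Rx = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    Ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
    Rz = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    matmul = lambda A, B: [[sum(A[i][k] * B[k][j] for k in range(3))
                            for j in range(3)] for i in range(3)]
    return matmul(Rz, matmul(Ry, Rx))

# The final_camera_rotation values quoted in step 2:
R = euler_xyz_to_matrix(1.8459843397140503,
                        0.007588082458823919,
                        1.4350531101226807)
```

Combining this rotation with the camera_location column gives the same camera-to-world pose that Blender builds from the location plus rotation_euler fields.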

ALTERNATIVELY, I did step 2 by using the camera_rt_matrix. In this case I set camera.matrix_world = camera_rt_matrix.invert(). I also needed to decrease the camera's elevation by 180 degrees, after which the two methods (Euler, RT matrix) were equivalent.
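The 180-degree adjustment is, I believe, the usual mismatch between the computer-vision camera convention (camera looks down +Z) and Blender's (camera looks down -Z); it amounts to rotating the camera 180° about its own local X axis. A sketch of the equivalent matrix operation, assuming a 4×4 row-major camera-to-world matrix (the helper name is hypothetical):

```python
def flip_camera_convention(world):
    """Rotate a camera 180 degrees about its own local X axis by
    right-multiplying the 4x4 camera-to-world matrix with
    diag(1, -1, -1, 1). This negates the second and third columns of
    the rotation block and leaves the camera location (last column)
    untouched."""
    flipped = [row[:] for row in world]
    for i in range(3):
        flipped[i][1] = -flipped[i][1]
        flipped[i][2] = -flipped[i][2]
    return flipped
```

Applying this to the inverted camera_rt_matrix should, under these assumptions, reproduce the pose obtained from the Euler-angle route.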

After this I used the screen capture to produce the images shown above. Does this solve your issue?

alexsax commented 5 years ago

I'm closing this for now, but please reopen if this did not solve your issue.