RobotLocomotion / LabelFusion

LabelFusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes
http://labelfusion.csail.mit.edu

coordinate system origin #61

Closed: hygxy closed this issue 5 years ago

hygxy commented 5 years ago

Hi, I would like to ask about the origin of the coordinate system in which the generated pose is expressed:

I expected the camera to be the origin, but from this demo picture (lower left corner) it seems that is not the case.

[screenshot: posequestion]

Of course I don't know exactly where your camera is, but I have observed a similar situation in my own case, where I know for sure that the coordinate system shown in the lower left corner of the picture is not my camera coordinate system (neither its location nor its orientation coincides with my camera).

[screenshot: ownquestion]

So in which coordinate system does the generated pose lie? Thanks in advance.

patmarion commented 5 years ago

Hi,

ElasticFusion performs a scene reconstruction and saves a scene pointcloud. This pointcloud is a set of xyz points, and they are all expressed in the camera coordinate system of the first camera frame.

Aligned object poses are stored relative to the scene pointcloud, so that means it is in the camera coordinate system of the first frame of data capture. The object pose stored in the yaml file should be the transform that you can apply to the object mesh to align it with the scene pointcloud.

However, the LabelFusion object alignment tool loads the scene pointcloud and then re-orients it for convenience. It orients the pointcloud by searching for a dominant plane, such as a table surface. This is the coordinate system you see in the visualization you posted; in this coordinate system the Z axis points up. The transform can be arbitrary, and it is stored in the yaml config file under the key "firstFrameToWorld". This reorientation is implemented in the python method rotateReconstructionToStandardOrientation.
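If it helps, here is a rough sketch of how you could use that transform to move a point from the reoriented "world" frame shown in the visualization back into the first camera frame. I'm assuming firstFrameToWorld is stored as [[x, y, z], [w, qx, qy, qz]], the same layout as the object poses, and the filename below is just a placeholder, so please double check against your own data:

```python
# Minimal sketch (not LabelFusion code): move a point from the reoriented
# "world" frame back into the first camera frame.
# Assumption: firstFrameToWorld is stored as [[x, y, z], [w, qx, qy, qz]].
import numpy as np
import yaml
from scipy.spatial.transform import Rotation as R

with open('registration_result.yaml') as f:   # placeholder filename
    info = yaml.safe_load(f)

translation, quat_wxyz = info['firstFrameToWorld']
w, x, y, z = quat_wxyz
rotation = R.from_quat([x, y, z, w])          # scipy expects (x, y, z, w)

# Build the 4x4 transform: first camera frame -> reoriented world frame.
T_world_cam0 = np.eye(4)
T_world_cam0[:3, :3] = rotation.as_matrix()   # as_dcm() on older scipy
T_world_cam0[:3, 3] = translation

# Invert it to go from the world frame back to the first camera frame.
T_cam0_world = np.linalg.inv(T_world_cam0)

p_world = np.array([0.1, 0.2, 0.3, 1.0])      # example point (homogeneous)
p_cam0 = T_cam0_world @ p_world
```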

Hope this helps! It's been a little while since I've worked on LabelFusion so my memory might not be perfect, please double check this information to confirm it looks correct compared to what you are seeing.

hygxy commented 5 years ago

Hi, I ran a test based on your description, but got an unexpected result. Basically I did the following:

  1. Generated a bounding box of the model mesh with meshlab and noted down the 8 corner coordinates, as shown in this screenshot: [screenshot: modelmesh]

     p0(-20, 0, 19.9878), p1(-20, 0, -19.9878), p2(20, 0, -19.9878), p3(20, 0, 19.9878), p4(-20, 112, 19.9878), p5(-20, 112, -19.9878), p6(20, 112, -19.9878), p7(20, 112, 19.9878)

     These points are the corners of the nearest and farthest faces of the bounding box (from the coordinate system origin out to the 112 offset in the Y direction), ordered counterclockwise; they are collected into the model array in the sketch after this list.

  2. Used LabelFusion to create training datasets. From xxx_color_labels.png I can see that the alignment is correct, e.g.: [screenshot: 0000000030_color_labels]

  3. Reprojected the eight corners using the generated pose in xxx_poses.yaml. I used scipy.spatial.transform.Rotation to convert the quaternion to a rotation matrix. I also noticed that the from_quat() function of the scipy.spatial.transform package takes a quaternion in (x, y, z, w) format, but the generated poses are in (w, x, y, z) format. The modification and reprojection are as follows:

     ```python
     import numpy as np
     from scipy.spatial.transform import Rotation as R

     quaternion_list = meta['metal']['pose'][1]        # pose = [translation, rotation]
     # generated poses are (w, x, y, z); scipy's from_quat() expects (x, y, z, w)
     modified_quaternion = [quaternion_list[1], quaternion_list[2],
                            quaternion_list[3], quaternion_list[0]]
     my_r = R.from_quat(modified_quaternion).as_dcm()  # 3x3 rotation matrix (as_matrix() in newer scipy)
     my_t = np.resize(np.array(meta['metal']['pose'][0]), (3, 1))

     model = np.array(model) / 1000                    # bounding-box corners, mm -> m
     target = np.dot(my_r, model.T)                    # rotate: 3x8
     target = np.add(target, my_t).T                   # translate, transpose back to 8x3

     # pinhole projection with the camera intrinsics fx, fy, cx, cy
     p0 = (int((target[0][0] / target[0][2]) * fx + cx),
           int((target[0][1] / target[0][2]) * fy + cy))
     p1 = (int((target[1][0] / target[1][2]) * fx + cx),
           int((target[1][1] / target[1][2]) * fy + cy))
     ...
     p7 = (int((target[7][0] / target[7][2]) * fx + cx),
           int((target[7][1] / target[7][2]) * fy + cy))
     ```

  4. Drew lines (using OpenCV) to connect the reprojected corner points:

     ```python
     import cv2

     cv2.line(img, p0, p1, (255, 255, 255), 2)
     cv2.line(img, p0, p3, (255, 255, 255), 2)
     ...
     cv2.line(img, p6, p7, (255, 255, 255), 2)
     ```

  5. Displayed the results, but got something like this: [screenshot: reprojected]

     It seems that there is a positive offset in the Y direction in every picture.

     I am wondering why xxx_color_labels.png is correct but the bounding box is not. Do you use a different reprojection method? Any advice would be appreciated!
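For completeness, the model array used in step 3 holds the eight bounding-box corners from step 1, roughly assembled like this (assuming millimetre units, consistent with the /1000 conversion above):

```python
import numpy as np

# Eight bounding-box corners from meshlab (step 1), in millimetres.
model = np.array([
    [-20.0,   0.0,  19.9878],   # p0
    [-20.0,   0.0, -19.9878],   # p1
    [ 20.0,   0.0, -19.9878],   # p2
    [ 20.0,   0.0,  19.9878],   # p3
    [-20.0, 112.0,  19.9878],   # p4
    [-20.0, 112.0, -19.9878],   # p5
    [ 20.0, 112.0, -19.9878],   # p6
    [ 20.0, 112.0,  19.9878],   # p7
])
```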

hygxy commented 5 years ago

I've found that the origin of the mesh is not the same as the one shown in meshlab once the mesh is imported into the director python code used in LabelFusion; that's why there is an offset in the Y direction. Since I now know the cause of this issue, I am closing it, but I still don't understand why the offset exists or how large it is.

patmarion commented 5 years ago

Is it possible that meshlab is adding an offset? I checked the labelfusion code and I don't see any code where an additional offset is added. There are some applications in Director where we re-center meshes by subtracting the centroid, but I don't think this is occurring anywhere in the LabelFusion pipeline.

Rather than computing a bounding box in meshlab, could you try reading your mesh file (is it ascii ply?), writing down some of the mesh vertices, and using those? Does meshlab re-center the mesh? As another test, try adding or subtracting the mesh centroid from your bounding box points.
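A minimal sketch of that check, assuming the third-party trimesh package and a placeholder mesh filename (this is not part of the LabelFusion pipeline, just a quick way to inspect the vertices):

```python
import numpy as np
import trimesh

# Load the object mesh directly and compare with what meshlab reports.
mesh = trimesh.load('object_mesh.ply')          # placeholder path to your mesh

vertices = np.asarray(mesh.vertices)
print('vertex min :', vertices.min(axis=0))
print('vertex max :', vertices.max(axis=0))
print('centroid   :', vertices.mean(axis=0))

# Centroid test suggested above: shift a corner point by the vertex centroid
# and re-run the reprojection to see whether the Y offset disappears.
p0 = np.array([-20.0, 0.0, 19.9878])            # one corner from the earlier comment
print('p0 - centroid:', p0 - vertices.mean(axis=0))
print('p0 + centroid:', p0 + vertices.mean(axis=0))
```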