Closed davidpagnon closed 2 months ago
Hi @pixelite1201 and co,
Now I'm thinking of giving up on using the npz data for the camera and body positions and orientations, and using the be_seq.csv file instead. Once I'm sure the positions and orientations are good, I'll tackle pose, shape, and scale, but I suppose I will just have to retrieve that information from the npz file. Is there a readme file or something that explains the conventions be_seq.csv uses?
If I understand correctly, Group corresponds to the camera extrinsic parameters, and Body corresponds to the mesh ones. For both Group and Body, X, Y, Z are in centimeters, and Yaw, Pitch, Roll are in degrees. Bodies only have X, Y, and Yaw parameters.
Does this sound accurate? If so, I still have a problem because the camera is not looking at the person at all. Do the location axes need to be rotated? In which order do the angles need to be applied? Are they centered around the world origin or around the camera/body?
Hi again,
I made some progress but I am still stuck with sub-optimal results: the body meshes do not perfectly overlay the image. Do you see anything wrong with my method? I need it to be exact to the pixel.
Could the issue be due to:
In be_seq.csv, what do x_offset=650; [...] cam_x_offset=10.0;cam_y_offset=10.0;cam_z_offset=5.0 refer to?
Thank you in advance!
be_seq.csv: location X, location Y, and Yaw.
seq_000000_camera.csv: X, Y, Z, Roll, Pitch, Yaw (in that order), hfov.
The test case I am working on is 20221010_3_1000_batch01hand, seq_000000.
be_seq.csv gives:
Index,Type,Body,X,Y,Z,Yaw,Pitch,Roll,Comment
0,Comment,None,0,0,0,0,0,0,bodies_min=3;bodies_max=3;x_offset=650;y_offset=0.0;z_offset=0.0;x_min=-50;x_max=50;y_min=-250;y_max=250;yaw_min=0;yaw_max=360;cam_x_offset=10.0;cam_y_offset=10.0;cam_z_offset=5.0;cam_yaw_min=-3;cam_yaw_max=3;cam_pitch_min=-10;cam_pitch_max=3;cam_roll_min=-3;cam_roll_max=3;cam_config=cam_random_e
1,Group,None,5.058579444123886,-9.743741006226163,168.9532350186906,1.468126863951408,-2.9050680463167105,2.8139950143546857,sequence_name=seq_000000;frames=128;hdri=abandoned_church;camera_hfov=52.0
2,Body,rp_claudia_posed_005_1001,618.9016947055262,102.08905542916352,0.0,161.95890288786055,0.0,0.0,start_frame=1;texture_body=skin_f_white_04_ALB;texture_clothing=texture_09
3,Body,rp_beatrice_posed_025_1076,696.7017401790371,-229.90220980456667,0.0,314.3671548169336,0.0,0.0,start_frame=69;texture_body=skin_f_asian_09_ALB;texture_clothing=texture_16
4,Body,rp_cindy_posed_005_1097,684.9980306666062,-54.09495558334399,0.0,103.42616917612702,0.0,0.0,start_frame=65;texture_body=skin_f_indian_10_ALB;texture_clothing=texture_06
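For readers who want to pull the Group and Body rows out of a file like this, here is a minimal sketch using only the Python standard library. The sample text is two rows from the excerpt above with the numbers shortened for readability, `read_rows` is a hypothetical helper name, and the division by 100 assumes the X/Y values are centimeters, as the question supposes.

```python
import csv
import io

# Two rows copied from the be_seq.csv excerpt above (values shortened for readability).
BE_SEQ_SAMPLE = """Index,Type,Body,X,Y,Z,Yaw,Pitch,Roll,Comment
1,Group,None,5.0586,-9.7437,168.9532,1.4681,-2.9051,2.8140,sequence_name=seq_000000;frames=128
2,Body,rp_claudia_posed_005_1001,618.9017,102.0891,0.0,161.9589,0.0,0.0,start_frame=1
"""

def read_rows(text):
    """Group rows by the Type column and convert the numeric columns to float."""
    grouped = {}
    for row in csv.DictReader(io.StringIO(text)):
        for key in ("X", "Y", "Z", "Yaw", "Pitch", "Roll"):
            row[key] = float(row[key])
        grouped.setdefault(row["Type"], []).append(row)
    return grouped

rows = read_rows(BE_SEQ_SAMPLE)
for body in rows["Body"]:
    # Assumption: X/Y are in centimeters, so divide by 100 to get meters.
    print(body["Body"], body["X"] / 100.0, body["Y"] / 100.0, body["Yaw"])
```

To read a real file, the same function could be given the contents of `open(path).read()` instead of the sample string.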
seq_000000_camera.csv gives:
name,x,y,z,yaw,pitch,roll,focal_length,sensor_width,sensor_height,hfov
seq_000000_0000.png,5.058579,-9.743741,168.953232,1.468127,-2.905068,2.813995,36.905,36,20.25,52
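As a quick sanity check on the intrinsics in this row, the hfov column can be recomputed from focal_length and sensor_width with the pinhole relation hfov = 2 * atan(sensor_width / (2 * focal_length)), and then converted to a focal length in pixels. This is only a sketch: the function names are hypothetical, and the image width of 1280 pixels is an assumption that should be replaced by the actual width of the rendered images.

```python
import math

def hfov_from_focal(focal_length_mm, sensor_width_mm):
    """Horizontal field of view in degrees for an ideal pinhole camera."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

def focal_length_px(hfov_deg, image_width_px):
    """Focal length in pixels for a given horizontal field of view."""
    return (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)

# Values from the seq_000000_camera.csv row above.
hfov = hfov_from_focal(36.905, 36.0)
print(hfov)  # should be very close to the hfov column (52)

IMAGE_WIDTH_PX = 1280  # assumption: replace with the real image width
print(focal_length_px(52.0, IMAGE_WIDTH_PX))
```

If the recomputed hfov disagrees noticeably with the CSV column, the focal length or sensor size being used is probably not the one the renderer used.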
Always use the dedicated camera ground truth files (seq_XXXXXX_camera.csv) which we provide for retrieving camera extrinsics/intrinsics. There are many sequences in the dataset where the camera is moving. The be_seq.csv file does not provide camera ground truth for all types of shots in BEDLAM.
Camera ground truth .csv files use the Unreal coordinate system to describe the camera's world-space location and rotation.
Rotation order: yaw around the Z-axis, then pitch around the new local Y-axis, then roll around the new local X-axis. Note that this matches the order in the CSV file columns.
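Rotations applied successively about the new local axes (yaw, then pitch, then roll) compose by right-multiplication, R = Rz(yaw) · Ry(pitch) · Rx(roll). The sketch below illustrates only that composition order, in pure Python. It uses the standard right-handed rotation matrices, which is an assumption: Unreal's coordinate system is left-handed, so the sign of individual angles and an axis flip may need adjusting before the result matches a right-handed tool such as Blender. The function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def yaw_pitch_roll_to_matrix(yaw_deg, pitch_deg, roll_deg):
    """Compose yaw (Z), pitch (new local Y), roll (new local X), in that order.

    Assumption: right-handed sign conventions; adapt signs for Unreal data.
    """
    yaw, pitch, roll = (math.radians(v) for v in (yaw_deg, pitch_deg, roll_deg))
    return matmul(matmul(rot_z(yaw), rot_y(pitch)), rot_x(roll))

# Angles from the seq_000000_camera.csv row (yaw, pitch, roll columns).
R = yaw_pitch_roll_to_matrix(1.468127, -2.905068, 2.813995)
for row in R:
    print(row)
```

Reversing the multiplication order would instead rotate about the fixed world axes, which is a common source of small residual offsets like the one discussed in this thread.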
Wow, the Unreal coordinate system is quite unusual! The weird transformations I was doing make a bit more sense now, thanks. I'll experiment with it next week and see if it fixes everything! Best regards,
Make sure that you also enable SMPL-X pose correctives when using the Blender add-on to import animations. See Notes section at https://github.com/PerceivingSystems/bedlam_render/tree/main/blender/smplx_anim_to_alembic
Thank you @tpsmpi !
So I implemented the transformations you specified, and it seems like it is exactly equivalent to what I did above -- but at least, now it makes sense. However, there is still this little offset.
I'm wondering if this might be an issue with the positioning of the body mesh. I enabled pose correctives, but that does not change location or rotation. There is this line I don't understand in the be_seq.csv file:
0,Comment,None,0,0,0,0,0,0,bodies_min=3;bodies_max=3;x_offset=650;y_offset=0.0;z_offset=0.0;x_min=-50;x_max=50;y_min=-250;y_max=250;yaw_min=0;yaw_max=360;cam_x_offset=10.0;cam_y_offset=10.0;cam_z_offset=5.0;cam_yaw_min=-3;cam_yaw_max=3;cam_pitch_min=-10;cam_pitch_max=3;cam_roll_min=-3;cam_roll_max=3;cam_config=cam_random_e
What does x_offset=650;y_offset=0.0;z_offset=0.0 mean?
And cam_x_offset=10.0;cam_y_offset=10.0;cam_z_offset=5.0?
cam_config=cam_random_e?
On other sequences, I see parameters on the camera lines of be_seq.csv that I also don't understand:
cameraroot_x=3350.0;cameraroot_y=1050.0;cameraroot_z=70.0?
cameraroot_yaw=131.48002763593706?
Line index 0 is just an FYI comment about the randomization parameters. The cameraroot parameters are used to set up the camera rig in Unreal for rendering. See also: https://github.com/PerceivingSystems/bedlam_render/issues/10 But always use the dedicated camera ground truth files for anything camera related. They contain all you need.
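For anyone who still wants to inspect those randomization parameters, the Comment column is a semicolon-separated list of key=value pairs, so it can be split into a dictionary with a few lines of Python. This is a minimal sketch with a hypothetical helper name; the sample string is a shortened copy of the line quoted above.

```python
def parse_comment(comment):
    """Split 'key=value;key=value' into a dict, converting numbers to float."""
    params = {}
    for item in comment.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        try:
            params[key] = float(value)
        except ValueError:
            params[key] = value  # keep non-numeric values such as cam_config as text
    return params

# Shortened copy of the Comment field from line index 0.
comment = "bodies_min=3;bodies_max=3;x_offset=650;y_offset=0.0;cam_config=cam_random_e"
params = parse_comment(comment)
print(params["x_offset"], params["cam_config"])
```

As the reply notes, these values describe how the scene was randomized and are informational only; the camera ground truth files remain the source for camera parameters.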
Never mind, I figured it out! Because I had to rotate my Blender camera so that it would face the +X direction, I ran into an annoying brain teaser when trying to rotate around the new local axes, but it's all good now!
Thank you for taking the time to write this very clear file and for your help on this thread, I have been stuck on it for quite a while.
I can't reopen an issue and I'm not sure how the notification system works on closed ones, so I'm posting this as a new one. Please pardon me if this has already been seen. Here was my question:
I'm still kind of fuzzy about why and how the camera extrinsic parameters change from person to person on the same image. Why do we need to change both the position of the camera (npz_params['cam_ext']) and of the body (npz_params['trans_world'])?
In any case, since this is what I ultimately need, is there a way to get the camera calibration in the world coordinate system, and likewise for each SMPL-X mesh?