microsoft / voxelpose-pytorch

Official implementation of "VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment"
MIT License

Camera parameters processing #30

Open AntonioEscamilla opened 2 years ago

AntonioEscamilla commented 2 years ago

You explicitly mention in the readme that you have processed the camera parameters into your formats. Can you explain what processing was done? In particular, on the camera translation parameter?

From the original Campus dataset one can obtain the translation T for each camera, but it is very different from the one you provide in the JSON calibration file. As an example, I got T = [-1.787557e+00, 1.361094e+00, 5.226973e+00] from the original data for cam 0, but in the JSON file you use T = [1774.89, -5051.69, 1923.35]. What special consideration is needed to obtain such values? How should they be interpreted?

I would appreciate it if you could elaborate more on this. Thank you for your time!!

wenwen-zhi commented 1 year ago

Does anyone know this?

mateuszk098 commented 1 month ago

The translation vector defined in the JSON file is given by `-np.linalg.inv(R) @ T * 1000`. R is the rotation matrix you can obtain using values from producePmat.m and the formulas from getRT.m, and T is the translation vector defined in producePmat.m. I don't know exactly why this transformation is required, but I found it here -> https://github.com/AlvinYH/Faster-VoxelPose/issues/21#issuecomment-1784128654. Multiplying by 1000 is probably because they use millimeters as the unit of length, while calibration parameters are usually given in meters. When you recalculate T this way, you should get exactly the same values as they have in the JSON file.
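A minimal sketch of the conversion described above. Assuming the original extrinsics follow the usual world-to-camera convention x_cam = R @ x_world + T (in meters), the quantity -R⁻¹ @ T is the camera center expressed in world coordinates, and scaling by 1000 converts meters to millimeters. The R and T values below are illustrative placeholders, not real Campus calibration data:

```python
import numpy as np

# Placeholder rotation matrix (as would come from producePmat.m / getRT.m).
R = np.array([[0.0, -1.0, 0.0],
              [0.0, 0.0, -1.0],
              [1.0, 0.0, 0.0]])

# Placeholder translation vector in meters (as defined in producePmat.m).
T = np.array([-1.787557, 1.361094, 5.226973])

# Camera center in world coordinates, converted from meters to millimeters.
# This matches the formula quoted above: -np.linalg.inv(R) @ T * 1000.
T_json = -np.linalg.inv(R) @ T * 1000.0
print(T_json)
```

Since R is a rotation matrix, `np.linalg.inv(R)` equals `R.T`, so the same result can be computed without an explicit matrix inverse.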