Closed: Bob130 closed this issue 5 years ago.
If the camera intrinsic parameters are not available, the depth images cannot be converted to a point cloud in metric camera coordinates. If you are willing to assume an orthographic projection instead, then lifting the depth images directly (treating each pixel's (u, v, d) as a 3D point) and converting the result to a voxel representation should work.
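To make the difference concrete, here is a minimal NumPy sketch of both routes. It is not this repository's preprocessing code. The function names, the `pixel_size` scale factor, and the bounding-cube voxelization are all assumptions made for illustration. The pinhole version needs `fx`, `fy`, `cx`, `cy`; the orthographic version does not.

```python
import numpy as np

def depth_to_points_pinhole(depth, fx, fy, cx, cy):
    """Back-project a depth image (H, W) with a pinhole model (needs intrinsics)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # u: column index, v: row index
    valid = depth > 0                               # assumption: 0 marks missing depth
    z = depth[valid].astype(np.float64)
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def depth_to_points_orthographic(depth, pixel_size=1.0):
    """Lift each pixel directly to (u * s, v * s, d); no intrinsics required.

    pixel_size is a hypothetical scale that maps pixel units to depth units.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    pts = np.stack([u[valid] * pixel_size, v[valid] * pixel_size, depth[valid]], axis=1)
    return pts.astype(np.float64)

def voxelize(points, grid_size=32):
    """Binary occupancy grid over the bounding cube of the points."""
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    if len(points) == 0:
        return grid
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    extent = extent if extent > 0 else 1.0          # avoid division by zero
    idx = np.floor((points - lo) / extent * (grid_size - 1)).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid
```

With the orthographic lift, the x/y axes are in pixel units and the z axis is in depth units, so `pixel_size` (or a crop around the hand) has to be chosen by hand. Whether a pretrained model accepts voxels built this way is a separate question that this sketch does not answer.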
Can I estimate the 3D hand pose (uvd) on the ASL Finger Spelling Dataset using the pretrained models? Note that the resolution of its depth images is different, and no camera configuration is provided. Thanks in advance~