Closed Bob130 closed 5 years ago
Can I use the pretrained models to estimate 3D hand pose (uvd) on the ASL Finger Spelling dataset? Note that the resolution of its depth images is different and no camera configuration is provided. Thanks in advance~

Hi @Bob130, I think this should be possible. Since the ASL dataset was captured with a Kinect camera, you can start with off-the-shelf Kinect camera parameters. In terms of sensor characteristics, the NYU pretrained model should fit best. To handle the different resolution, you need to run the preprocessing/hand detection step to prepare the input data for the network.
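As a starting point for the missing camera configuration, here is a minimal sketch of the uvd ↔ xyz conversion under a pinhole camera model, assuming generic Kinect v1 intrinsics for a 640x480 depth stream. The intrinsic values below are common defaults, not calibrated for the ASL Finger Spelling capture, so treat them as placeholders:

```python
import numpy as np

# Assumed Kinect v1 depth intrinsics (fx, fy, cx, cy) for 640x480;
# these are generic defaults, NOT calibrated for the ASL dataset.
FX, FY = 588.0, 588.0
CX, CY = 320.0, 240.0

def uvd_to_xyz(uvd):
    """Back-project (u, v, depth) pixels to 3D camera coordinates.

    Depth units are preserved (e.g. mm in, mm out).
    """
    uvd = np.asarray(uvd, dtype=np.float64)
    x = (uvd[..., 0] - CX) * uvd[..., 2] / FX
    y = (uvd[..., 1] - CY) * uvd[..., 2] / FY
    z = uvd[..., 2]
    return np.stack([x, y, z], axis=-1)

def xyz_to_uvd(xyz):
    """Project 3D camera coordinates back to (u, v, depth) pixels."""
    xyz = np.asarray(xyz, dtype=np.float64)
    u = xyz[..., 0] * FX / xyz[..., 2] + CX
    v = xyz[..., 1] * FY / xyz[..., 2] + CY
    return np.stack([u, v, xyz[..., 2]], axis=-1)

# Round trip: a point at the image center, 500 mm from the camera.
p = uvd_to_xyz([320.0, 240.0, 500.0])
# p -> [0., 0., 500.]
```

Note that if the ASL depth images are a different resolution than 640x480, the principal point (cx, cy) and focal lengths must be scaled by the same resize factor before using them.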