average reconstruction error problem

Hi @melonwan. I've been researching hand tracking recently, and I'm very interested in your crossingNet work. I followed your project and trained crossingNet on the NYU and MSRA datasets. I get an average reconstruction error of 0.198221 (NYU) and 0.078984 (MSRA). Are these training results reasonable? I couldn't find the corresponding numbers in your paper.

I then tried to validate these two trained models on my own depth images from an Orbbec Astra Mini device, but neither model gives good results, so I wonder what causes the bad performance. Is it the generalization capability of crossingNet, or something else? For example, the hand detector and center localization methods in the crossingNet project may not suit depth images from the Orbbec Astra Mini, or the resolution of the Orbbec Astra Mini's depth images may be lower than that of Kinect and Intel devices. I do hand detection and center localization with the functions from this project, modifying only the camera parameters.

Finally, thanks again for sharing your amazing work. I'm looking forward to your suggestions and reply.

Hey, thank you for your interest. The reconstruction error is used to check the quality of the images generated by the GAN decoder, but it was found not to be closely related to visual quality. For testing on real-world data, I agree with you that the generalization capacity of the network is limited and that the hand detector plays a critical role in the final accuracy. Regarding generalization, crossingNet is limited to the depth-camera noise, hand shapes, and poses covered by the training dataset. As this work was proposed almost 2 years ago and hand pose estimation is moving forward quite quickly, my suggestion would be to switch to the current state of the art, e.g., denseReg or V2V-PoseNet_RELEASE, both available on GitHub.
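To make the camera-parameter point concrete, here is a minimal sketch (not code from the crossingNet repository) of how pinhole intrinsics affect the hand crop. It assumes a standard pinhole model and depth in millimetres. The function names and all numeric values, including the focal lengths and the cube size, are hypothetical placeholders, not the real intrinsics of any dataset or of the Orbbec Astra Mini.

```python
def pixel_to_camera(u, v, depth_mm, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in mm to camera-space (x, y, z) in mm.

    Assumes an ideal pinhole camera with no lens distortion.
    """
    x = (u - cx) * depth_mm / fx
    y = (v - cy) * depth_mm / fy
    return x, y, depth_mm


def cube_crop_size_px(cube_mm, depth_mm, fx):
    """Side length in pixels of a cube_mm-wide box centred at depth_mm."""
    return cube_mm * fx / depth_mm


# Placeholder intrinsics for two different sensors (illustrative values only).
train_fx = 588.0
test_fx = 570.0

# The same 250 mm cube at 500 mm depth covers a different number of pixels
# on each sensor, so the crop must be computed with the matching focal length.
train_crop = cube_crop_size_px(250.0, 500.0, train_fx)
test_crop = cube_crop_size_px(250.0, 500.0, test_fx)
print(train_crop, test_crop)
```

If the crop is computed with the wrong intrinsics, or the depth is not in the unit the code expects, the hand ends up at a different scale in the network input than during training. That alone can degrade results before generalization even comes into play.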

melonwan / crossingNet
