Thanks for your great work and congratulations! Since the unsup3d model is trained on RGB input only, what if I have an RGBD human face dataset captured by commodity RGBD cameras? How should I add supervision on the depth part to make full use of the depth information?
Actually, I have tried some weakly-supervised methods such as "Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set", which estimates the BFM parameters directly and uses a differentiable renderer (my choice is PyTorch3D) for end-to-end weakly-supervised training. I tried an L1/L2 loss between the rendered z-buffer and the real depth map, but this depth loss may conflict with the other loss components (RGB loss, landmark loss).
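For reference, this is roughly how I compute the depth term (a minimal sketch, assuming PyTorch3D's convention that background pixels in the z-buffer are -1 and that invalid pixels in the sensor depth map are 0; `depth_weight` is just a hyperparameter I picked):

```python
import torch

def depth_loss(zbuf, depth_gt, depth_weight=0.1):
    """L1 loss between the rendered z-buffer and the sensor depth map."""
    # zbuf: (N, H, W, K) from PyTorch3D's rasterizer; background pixels are -1
    # depth_gt: (N, H, W) depth from the RGBD camera; invalid pixels assumed 0
    rendered = zbuf[..., 0]                      # nearest face per pixel
    valid = (rendered > 0) & (depth_gt > 0)      # supervise only mutually valid pixels
    if not valid.any():
        return zbuf.new_zeros(())
    return depth_weight * torch.abs(rendered[valid] - depth_gt[valid]).mean()
```

I simply add this term to the RGB and landmark losses, and it is once this term is active that the conflict between the components shows up.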
Any suggestions on this?
Thanks