Closed by mrgloom 5 years ago
The input is the RGB image stacked alongside the previously produced heatmaps.
And what is the output? Is it just 68 landmark heatmaps, i.e. it doesn't predict a depth map like https://github.com/AaronJackson/vrn does? https://github.com/1adrianb/face-alignment/blob/f0cf27cf7c9141f567e135c21495702cdf0aefe3/face_alignment/models.py#L219
Aaron's work predicts a 3D voxelized volume (DxHxW). This network instead outputs 68 values, one per landmark, each giving the depth at that landmark's (x, y) location.
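For anyone reading this later, here is a rough sketch of the data flow described in this thread, in plain Python so it runs without any dependencies. It only illustrates the shapes: the RGB image and the 68 heatmaps are stacked channel-wise to form the depth network's input, and the 68 predicted depths are paired with the 2D landmark locations to give 3D points. The helper names, the 8x8 resolution, and the zero-filled placeholder data are assumptions made for illustration; they are not taken from the face-alignment code.

```python
# Illustrative sketch only: sizes and helper names are hypothetical,
# not the face-alignment library's API.
NUM_LANDMARKS = 68
H = W = 8  # toy resolution; the real input size is not stated in this thread


def make_planes(count, height, width, fill=0.0):
    """Build `count` image planes of shape (height, width) filled with `fill`."""
    return [[[fill] * width for _ in range(height)] for _ in range(count)]


def stack_depth_input(rgb, heatmaps):
    """Concatenate along the channel axis: 3 RGB planes + 68 heatmap planes."""
    return list(rgb) + list(heatmaps)


def to_3d_landmarks(points_2d, depths):
    """Attach one depth value to each (x, y) landmark, giving (x, y, z)."""
    return [(x, y, z) for (x, y), z in zip(points_2d, depths)]


rgb = make_planes(3, H, W)                    # input image, 3 channels
heatmaps = make_planes(NUM_LANDMARKS, H, W)   # output of the 2D landmark stage
depth_input = stack_depth_input(rgb, heatmaps)

# Placeholder standing in for the depth network's output: one scalar per landmark.
depths = [0.0] * NUM_LANDMARKS

# Placeholder 2D landmark locations (in practice, derived from the heatmaps).
points_2d = [(i % W, i % H) for i in range(NUM_LANDMARKS)]
landmarks_3d = to_3d_landmarks(points_2d, depths)

print(len(depth_input), len(landmarks_3d))
```

So the depth stage returns a vector of 68 scalars rather than a dense depth map or a voxel volume, which is the difference from VRN mentioned above.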
What are the input and output of the depth network?
https://github.com/1adrianb/face-alignment/blob/87a496b158ff9a215aa6f48262f4e13d8e6c4dd7/face_alignment/api.py#L49