I would like to report a possible miscalculation of the loss in the face generator.
Issue description
Please have a look at the following code snippet: https://github.com/yhw-yhw/TalkSHOW/blob/38aab300b0aba6fc631ad139f62a6cea87261a0c/nets/smplx_face.py#L155-L159
I believe the loss calculation at line 155 is wrong. The slice should go up to index 3, not 6, because `jaw_pose` has 3 dimensions.
As a reminder, `pred_poses` has shape `(N, seq_length, 103)`, where the first 3 dimensions are the jaw pose and the remaining 100 are the expression.
`gt_poses` has shape `(N, seq_length, 265)`, where the first 3 dimensions are the jaw pose and the last 100 are the expression. The 3 dimensions right after the jaw pose are the left eye.
When we compute
`MSELoss = torch.mean(torch.abs(pred_poses[:, :, :6] - gt_poses[:, :, :6]))`
the first 3 jaw pose features are compared correctly, but the 3 left eye features from `gt_poses` are also compared with 3 expression features from `pred_poses`.
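To make the mismatch concrete, here is a small sketch in plain Python (no PyTorch needed) that labels each column of the last axis according to the layout described above and lists which labels end up paired by a `:6` slice. The helper names and the `"other"` label for the `gt_poses` columns between the left eye and the expression block are my own placeholders, since I have not checked what those columns hold.

```python
# Column layouts of the last axis, as described in this issue.
PRED_DIM = 103  # jaw pose (3) + expression (100)
GT_DIM = 265    # jaw pose (3) + left eye (3) + ... + expression (last 100)


def pred_labels():
    """Label every column of pred_poses along the last axis."""
    return ["jaw"] * 3 + ["expression"] * 100


def gt_labels():
    """Label every column of gt_poses along the last axis.

    "other" is a placeholder for the columns between the left eye and
    the expression block; their meaning is not covered in this issue.
    """
    middle = GT_DIM - 3 - 3 - 100
    return ["jaw"] * 3 + ["left_eye"] * 3 + ["other"] * middle + ["expression"] * 100


def compared_pairs(stop):
    """Return the (pred, gt) label pairs that a [:, :, :stop] slice compares."""
    return list(zip(pred_labels()[:stop], gt_labels()[:stop]))


# Columns where the two tensors hold different kinds of features.
mismatched = [(i, p, g) for i, (p, g) in enumerate(compared_pairs(6)) if p != g]
print(mismatched)
# [(3, 'expression', 'left_eye'), (4, 'expression', 'left_eye'), (5, 'expression', 'left_eye')]
```

With `compared_pairs(3)` every pair is `('jaw', 'jaw')`, which is what a jaw pose loss should compare.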
Proposed Fix:
I believe the correct way to calculate the loss is to change 6 to 3, as follows:
`MSELoss = torch.mean(torch.abs(pred_poses[:, :, :3] - gt_poses[:, :, :3]))`
Please let me know if my assertion is correct or whether I misunderstood something.