Hi. First of all, thank you for sharing your great work.
I have two questions about the method proposed in your paper.
In Eq. (1), L_2D is calculated as the mean squared error between heatmaps. But how does one obtain the ground-truth heatmap? As far as I understand, a heatmap encodes per-pixel probabilities of joint locations, and it is usually a byproduct of 2D pose estimation rather than given data.
My guess is that you assume, for instance, a normal distribution around each ground-truth 3D joint and project it to generate the ground-truth heatmap. This seems consistent with the paper, which states: "... the 3D lifting module can be trained independently using 3D mocap data and its projected heatmaps." (p. 7732, Sec. 5 "Architecture", second paragraph)
I wonder if my guess is right, and whether it is acceptable to use a fixed size (standard deviation) when generating the ground-truth heatmaps.
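To make my guess concrete, here is a minimal sketch of what I mean: project a 3D joint into the image and render a fixed-σ Gaussian around it. This is purely my assumption, not from the paper; the function names and camera intrinsics are hypothetical:

```python
import numpy as np

def gaussian_heatmap(center_xy, size_hw, sigma=2.0):
    """Render a 2D Gaussian centered at a projected joint location.

    center_xy: (x, y) pixel coordinates of the projected ground-truth joint
    size_hw:   (height, width) of the heatmap
    sigma:     fixed standard deviation in pixels (the "fixed size" in my question)
    """
    h, w = size_hw
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def project(joint_3d, f=100.0, cx=32.0, cy=32.0):
    """Simple pinhole projection of a 3D joint (assumed camera intrinsics)."""
    x, y, z = joint_3d
    return (f * x / z + cx, f * y / z + cy)

# Example: one 3D mocap joint -> projected 2D location -> ground-truth heatmap
joint_3d = np.array([0.1, -0.2, 2.0])
hm = gaussian_heatmap(project(joint_3d), size_hw=(64, 64), sigma=2.0)
```

My question is essentially whether a single fixed `sigma` like this is adequate, given that the apparent joint size varies with depth.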
Did you conduct a comparative study on the embedding dimension? 20-D seems quite small.
I wonder whether changing the embedding dimension would seriously affect accuracy and/or inference speed.