How do you obtain the ground truth for 3D pose estimation? Is it derived from the detection results of a model trained with the 2D ground truth, or is it the dataset's original annotated 3D ground truth?

Additionally, I noticed that the Human3.6M dataset consists of individual images, while the code you provided performs 3D pose regression on videos. Could you please advise on how to handle this situation? I look forward to your response whenever you have time.
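To make the second question concrete, here is a minimal sketch of what I currently imagine doing to feed the frame-based Human3.6M data into a video model: padding each sequence and slicing it into overlapping temporal windows. The function name `frames_to_clips` and the 243-frame receptive field are only my placeholders, not taken from your code.

```python
import numpy as np

def frames_to_clips(keypoints_2d, receptive_field=243):
    """Pad a (T, J, 2) array of per-frame 2D keypoints and slice it into
    overlapping clips so a temporal (video) model can regress the 3D pose
    of every individual frame."""
    pad = receptive_field // 2
    # Replicate the first/last frame at the boundaries so every original
    # frame sits at the centre of exactly one clip.
    padded = np.concatenate(
        [np.repeat(keypoints_2d[:1], pad, axis=0),
         keypoints_2d,
         np.repeat(keypoints_2d[-1:], pad, axis=0)],
        axis=0,
    )
    clips = [padded[t:t + receptive_field] for t in range(len(keypoints_2d))]
    return np.stack(clips)  # shape: (T, receptive_field, J, 2)

# Example: 1000 frames, 17 joints -> 1000 clips of 243 frames each.
clips = frames_to_clips(np.zeros((1000, 17, 2)))
print(clips.shape)  # (1000, 243, 17, 2)
```

Is something along these lines the intended way to run your model on Human3.6M frames, or did I miss a preprocessing step in the repository?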