Closed: AliasChenYi closed this issue 3 months ago.
How do I use GT 2D poses for training? Can you help me?
@AliasChenYi This was mentioned before in #5. What do you mean by real 2D pose estimation training?
I mean the second item in your paper: "We use the Stacked Hourglass 2D pose detection results and 2D ground truths on Human3.6M."
Yeah, the 2D ground truth is explained in #5.
I get it, thank you very much.
I have another question: the configuration file references the data/motion2d/ directory, but I don't actually have that directory, and looking at the rest of the code it seems unused. Could you tell me what it does?
Ohhh, I believe that's leftover from MotionBERT that I forgot to delete. MotionBERT also has some 2D datasets that it uses for pretraining, such as PoseTrack and InstaVariety (see here for details), but I didn't use them. The data_root_2d option in MotionAGFormer is unused.
Understood. So for GT-2D training, we don't need to cut the sequences into clips such as 27 frames, 81 frames, etc.?
For MotionAGFormer-XS and MotionAGFormer-S we do.
What about MotionAGFormer-B and MotionAGFormer-L? I don't know whether you did that there. Do I have to do it again if I train those versions?
Ohh, I mean that for GT-2D training, we don't need to re-run the Stacked Hourglass 2D preprocessing, i.e., commands like python h36m.py --n-frames 243?
We still have to run it, because that preprocessing handles both the Stacked Hourglass 2D detections and the 3D ground truth. And since the 2D ground truth is derived from the 3D ground truth, the preprocessing is still needed.
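For intuition, here is a minimal sketch of how 2D ground truth can be derived from 3D ground truth via pinhole projection. The function name and intrinsics handling are hypothetical; the actual preprocessing in h36m.py may additionally normalize coordinates or correct for lens distortion:

```python
import numpy as np

def project_gt_3d_to_2d(joints_3d_cam, fx, fy, cx, cy):
    """Hypothetical helper: perspective-project camera-space 3D joints
    of shape (T, 17, 3) into 2D pixel coordinates of shape (T, 17, 2)
    using pinhole intrinsics (focal lengths fx, fy; principal point cx, cy)."""
    # Divide by depth (z) to get normalized image-plane coordinates.
    x = joints_3d_cam[..., 0] / joints_3d_cam[..., 2]
    y = joints_3d_cam[..., 1] / joints_3d_cam[..., 2]
    # Scale by focal length and shift by the principal point.
    u = fx * x + cx
    v = fy * y + cy
    return np.stack([u, v], axis=-1)
```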
Ok, thank you very much for your reply, I will implement it right now.
Sorry to bother you again. I would like to ask what train_2d means. Why is it absent in the XS and S configs but present in the B and L versions? Also, my reproduced results are always lower than the paper's. Can you give me some advice?
For the Base version, the paper reports P1 = 38.4 and P2 = 32.6, but I reproduced P1 = 38.9 and P2 = 32.7. The best result was reached at epoch 17; after that, training only got worse and fluctuated unstably. Is this a normal phenomenon?
@AliasChenYi Your first question: it's another MotionBERT hyperparameter that I didn't use and forgot to delete. Forget about it (it is for including a 2D dataset in pretraining, which we don't have).
Second question: from my experiments a year back, I noticed that when I changed the GPU from an A40 to something else, or when I changed the batch size, the final result was slightly worse. I believe you can't replicate exactly the same result unless you have the same environment that I had.
And I forgot to mention: it is OK to have fluctuations after the first few epochs.
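As an aside, below is a minimal sketch of the usual PyTorch determinism knobs; note that even with all of these set, results can still drift across different GPU models and batch sizes, which matches the observation above:

```python
import random
import numpy as np
import torch

# Seed every RNG the training loop might touch.
seed = 0
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed_all(seed)

# Trade speed for reproducibility in cuDNN kernel selection.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```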
Is it true that all versions achieve their best results over 90 epochs? During training I observed that a good result can already be obtained within the first 20 epochs.
Honestly, I don't remember. Unfortunately I accidentally deleted the training logs, so I'm not sure.
Thank you very much for your reply
Sorry to disturb you again. I would like to ask which part of the code is used to calculate the MACs and MACs/frame metrics?
@AliasChenYi Answered in #16.
For MACs/frame, you can simply divide the total MACs by the number of input frames. The reason I report it this way is that some models (e.g., PoseFormerV2) predict only the center frame and need to run the forward pass F times to produce the same number of outputs.
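For illustration, here is a minimal sketch using the thop package (an assumption on my part; #16 may describe a different profiler), with a stand-in model and an assumed input shape of (1, T, 17, 3):

```python
import torch
from thop import profile  # pip install thop

# Stand-in model for illustration; replace with an actual MotionAGFormer instance.
model = torch.nn.Linear(3, 3)

T = 243                          # number of input frames
x = torch.randn(1, T, 17, 3)     # assumed shape: (batch, frames, joints, channels)

# thop returns total multiply-accumulate operations and the parameter count.
macs, params = profile(model, inputs=(x,))
print(f"Total MACs: {macs:.3e}")
print(f"MACs/frame: {macs / T:.3e}")  # divide by frame count, as described above
```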
Ohh, thank you very much!
Hello, I would like to ask: what needs to be done for real 2D pose estimation training?