Open vhehduatks opened 3 months ago
@vhehduatks Wow, amazing find! I think you are absolutely correct. For your reference, we translated the code from the mo2cap2 dataset. Their dataset package includes code.zip, which contains the file code/mo2cap2_eval.m. That file contains:
for num_seq = start_frm:end_frm
% please download the test dataset and unzip the images to '../test/'
image_name = sprintf('../test_data/%s/%s%04d.%s', seq_name, img_prefix, num_seq, img_format);
% ** compute 3D joint positions with your own method **
% -------------------------------------------
joint3D = []; % your raw prediction (3-by-npart matrix)
% compute errors
% -------------------------------------------
% rescale and rigidly align the skeleton to ground truth
pose_gt_3d = squeeze(pose_gt(num_seq-frm_offset,:,:))';
pose_gt_3d = skeleton_rescale(pose_gt_3d, bone_length(2:end), kinematic_parents);
joint3D = skeleton_rescale(joint3D, bone_length(2:end), kinematic_parents);
[~,pose_gt_3d_rot,~] = procrustes(joint3D',pose_gt_3d','scaling',false);
pred(:,:,num_seq-frm_offset) = joint3D;
gt(:,:,num_seq-frm_offset) = pose_gt_3d_rot';
error = joint3D-pose_gt_3d_rot';
joint_error = squeeze(sqrt(sum(error.^2,1)));
mean_joint_error = mean(joint_error);
end
Note that joint3D in the line
error = joint3D-pose_gt_3d_rot';
is the rescaled prediction, because it was overwritten earlier by
joint3D = skeleton_rescale(joint3D, bone_length(2:end), kinematic_parents);
Please let us know if this change works for you. @kimathikaai @sdhossain FYI, there was a bug with the mo2cap2 eval. See above.
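For anyone porting this to Python, here is a rough sketch of the evaluation order in the MATLAB snippet above: rescale both skeletons, rigidly align the ground truth to the rescaled prediction without scaling, then compute the error against the rescaled prediction. The function names and the `rescale_bones` logic are hypothetical stand-ins (the actual skeleton_rescale implementation is not shown in this thread), and the alignment here forbids reflections, which may differ from MATLAB's procrustes defaults.

```python
import numpy as np


def rescale_bones(joints, bone_lengths, parents):
    """Hypothetical stand-in for skeleton_rescale: set each bone to a target length.

    joints: (J, 3) array; parents[j] is the parent index of joint j (-1 for the root).
    bone_lengths[j] is the target length of the bone ending at joint j.
    Assumes every parent index is smaller than its child's index.
    """
    out = joints.astype(float).copy()
    for j, p in enumerate(parents):
        if p < 0:
            continue  # the root keeps its position
        bone = joints[j] - joints[p]
        length = max(np.linalg.norm(bone), 1e-12)  # avoid division by zero
        out[j] = out[p] + bone / length * bone_lengths[j]
    return out


def rigid_align(target, source):
    """Rotate and translate source (J, 3) onto target (J, 3), without scaling."""
    mu_t, mu_s = target.mean(axis=0), source.mean(axis=0)
    t0, s0 = target - mu_t, source - mu_s
    u, _, vt = np.linalg.svd(s0.T @ t0)
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0  # forbid reflections
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return s0 @ rot + mu_t


def mpjpe_rescaled(pred, gt, bone_lengths, parents):
    """Mean per-joint error, using the rescaled prediction in the difference."""
    pred_rescale = rescale_bones(pred, bone_lengths, parents)
    gt_rescale = rescale_bones(gt, bone_lengths, parents)
    gt_rot = rigid_align(pred_rescale, gt_rescale)
    # Both terms are at the same bone-length scale here.
    return np.linalg.norm(pred_rescale - gt_rot, axis=1).mean()
```

The point of the sketch is the last line: subtracting `gt_rot` from the raw prediction instead of `pred_rescale` would mix two different scales.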
model:
desc: null
value:
type: TopdownPoseEstimator
data_preprocessor:
type: PoseDataPreprocessor
mean:
- 123.675
- 116.28
- 103.53
std:
- 58.395
- 57.12
- 57.375
bgr_to_rgb: true
backbone:
type: ResNet
depth: 101
init_cfg:
type: Pretrained
checkpoint: torchvision://resnet101
head:
type: CustomMo2Cap2Baselinel1
in_channels: 2048
out_channels: 15
loss:
type: KeypointMSELoss
use_target_weight: true
loss_weight: 1000
loss_pose_l2norm:
type: pose_l2norm
loss_weight: 1.0
loss_cosine_similarity:
type: cosine_similarity
loss_weight: 0.1
loss_limb_length:
type: limb_length
loss_weight: 0.5
loss_heatmap_recon:
type: KeypointMSELoss
use_target_weight: true
loss_weight: 500
decoder:
type: Custom_mo2cap2_MSRAHeatmap
input_size:
- 256
- 256
heatmap_size:
- 47
- 47
sigma: 2
dataset_mo2cap2_train:
desc: null
value:
type: Mo2Cap2CocoDataset
data_root: /home/jovyan/vol_arvr_hyeonghwan/Mo2cap2_dataset/extracted_mo2cap2_dataset/TrainSet
data_mode: topdown
filter_cfg:
filter_empty_gt: false
min_size: 32
pipeline:
- type: LoadImage
- type: GetBBoxCenterScale
padding: 1.0
- type: TopdownAffine
input_size:
- 256
- 256
- type: GenerateTarget
encoder:
type: Custom_mo2cap2_MSRAHeatmap
input_size:
- 256
- 256
heatmap_size:
- 47
- 47
sigma: 2
- type: PackPoseInputs
- type: MultiStepLR
begin: 0
end: 70000
milestones:
- 1000
- 2000
- 3000
- 4000
- 5000
- 6000
- 7000
- 8000
- 9000
- 10000
- 11000
- 12000
- 13000
- 14000
- 15000
- 16000
- 17000
- 18000
- 19000
- 20000
- 21000
- 22000
- 23000
- 24000
- 25000
- 26000
- 27000
- 28000
- 29000
- 30000
- 31000
- 32000
- 33000
- 34000
- 35000
- 36000
- 37000
- 38000
- 39000
- 40000
- 41000
- 42000
- 43000
- 44000
- 45000
- 46000
- 47000
- 48000
- 49000
- 50000
- 51000
- 52000
- 53000
- 54000
- 55000
- 56000
- 57000
- 58000
- 59000
- 60000
- 61000
- 62000
- 63000
- 64000
- 65000
- 66000
- 67000
- 68000
- 69000
gamma: 0.8
by_epoch: false
optim_wrapper:
desc: null
value:
optimizer:
type: AdamW
lr: 0.0005
I tried to reproduce the results in my own experimental framework, based on the code you posted on GitHub, but the model keeps overfitting: MPJPE on the test set never goes below 130. Even with a very low learning rate (AdamW, initial lr 0.0005, multiplied by 0.8 every 1000 iterations), it does not track body joints in the outdoor sequences at all. Is there something wrong with the config? If you have any idea why the overfitting is happening, please let me know. I've spent almost two weeks trying to reproduce the results, and it's very frustrating.
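To make the schedule described above concrete, here is a small sketch that computes the learning rate at a given iteration. It assumes the usual multi-step semantics (the rate is multiplied by gamma once each milestone is reached); the function name `multistep_lr` is hypothetical and this is not the scheduler's actual code.

```python
from bisect import bisect_right


def multistep_lr(step, base_lr=0.0005, gamma=0.8,
                 milestones=tuple(range(1000, 70000, 1000))):
    """Learning rate at iteration `step` for a multi-step decay schedule.

    Defaults mirror the values quoted above: initial lr 0.0005,
    gamma 0.8, milestones every 1000 iterations from 1000 to 69000.
    """
    passed = bisect_right(milestones, step)  # milestones already reached
    return base_lr * gamma ** passed


if __name__ == "__main__":
    for step in (0, 1000, 10000, 30000, 69000):
        print(step, multistep_lr(step))
```

Under this reading, the rate is roughly 5.4e-5 after 10000 iterations and on the order of 1e-10 by the last milestone, so most of the 70000 iterations would run with a very small step size.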
dict_args = {
'model': 'mo2cap2_l1',
'eval': False,
'dataloader': 'mo2cap2',
'load': None,
'resume_from_checkpoint': None,
# 'dataset_tr': r'F:\extracted_mo2cap2_dataset\TrainSet',
'dataset_tr': '/home/jovyan/vol_arvr_hyeonghwan/Mo2cap2_dataset/extracted_mo2cap2_dataset/TrainSet',
'dataset_val': '/home/jovyan/vol_arvr_hyeonghwan/Mo2cap2_dataset/mo2cap2_data_half/ValSet',
'dataset_test': '/home/jovyan/vol_arvr_hyeonghwan/Mo2cap2_dataset/extracted_mo2cap2_dataset/TestSet',
'cuda': 'cuda',
'gpus': 1,
'batch_size': 100,
'epoch': 10,
'num_workers': 12,
'val_freq': 0.1,
'es_patience': 5,
'logdir': '/home/jovyan/vol_arvr_hyeonghwan/Ego-STAN/temp_res',
'lr': 0.001,
'load_resnet': '/home/jovyan/vol_arvr_hyeonghwan/Ego-STAN/resnet101-63fe2227.pth',
# 'hm_train_steps': 100000,
'hm_train_steps': 100000,
'seq_len': 5,
'skip': 0,
'encoder_type': 'branch_concat',
'heatmap_type': 'baseline',
'heatmap_resolution': [47, 47],
'image_resolution': [368, 368],
'seed': 42,
'clip_grad_norm': 0.0,
'dropout': 0.0,
'dropout_linear': 0.0,
'protocol': 'p2',
'w2c': False,
'weight_regularization': 0.01,
'monitor_metric': 'val_mpjpe_full_body',
'sigma': 3,
'h36m_sample_rate': 1,
'csv_mode': '3D'
}
Also, when I trained the model with the code provided on GitHub, I saw severe overfitting. Could you share the config for mo2cap2L1?
Hi @vhehduatks, do you mean reproducing our results on mo2cap2 or the results from the original mo2cap2 paper?
If you mean our results, you'll need to do the pretraining mentioned in . It's been a while, so I can't remember where the code is, but I remember we followed the description above closely; at least the description there is detailed.
If you mean the results from mo2cap2, we also had a very hard time trying to reproduce them based solely on the description in the paper, and we were never successful.
https://github.com/jmpark0808/Ego-STAN/blob/77622fc0df608dac334214fb77448564539295fb/utils/evaluate.py#L544-L552
I have some questions about the part where you evaluate the result. In the part where you output the joint error, it says
Isn't gt_rot the result of transforming gt_rescale to best align with pred_rescale, where gt_rescale is gt adjusted to mean3D's scale? In other words, if gt_rot follows the scale of gt_rescale, then error should be
?