I defined a new dataset for training. After fewer than 20 epochs, AP and AR at eval all become 0. How can I fix this?

EbenezerO commented 2 years ago

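The new dataset is similar to COCO, with 5 additional keypoints. During training, loss and acc_pose converge normally, but at eval time AP and AR, after rising to around 0.19, drop sharply and then become 0.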

train log

2022-06-20 03:44:41,448 - mmpose - INFO - workflow: [('train', 1)], max: 210 epochs
2022-06-20 03:44:41,448 - mmpose - INFO - Checkpoints will be saved to /home/projects/open_mmpose/work_dirs/hrformer_base_sport22_384x288_udp_coarsedropout by HardDiskBackend.
2022-06-20 03:48:32,532 - mmpose - INFO - Epoch [1][50/54]  lr: 4.945e-05, eta: 14:29:36, time: 4.622, data_time: 0.049, memory: 29719, heatmap_loss: 0.0024, acc_pose: 0.2557, loss: 0.0024
2022-06-20 03:52:37,462 - mmpose - INFO - Epoch [2][50/54]  lr: 1.034e-04, eta: 13:49:47, time: 4.595, data_time: 0.048, memory: 29719, heatmap_loss: 0.0021, acc_pose: 0.4303, loss: 0.0021
2022-06-20 03:52:52,630 - mmpose - INFO - Saving checkpoint at 2 epochs
2022-06-20 03:53:09,250 - mmpose - INFO - Now best checkpoint is saved as best_AP_epoch_2.pth.
2022-06-20 03:53:09,250 - mmpose - INFO - Best AP is 0.1369 at 2 epoch.
2022-06-20 03:53:09,251 - mmpose - INFO - Epoch(val) [2][21]    AP: 0.1369, AP .5: 0.2762, AP .75: 0.1107, AP (M): 0.0171, AP (L): 0.1704, AR: 0.1624, AR .5: 0.2903, AR .75: 0.1562, AR (M): 0.0351, AR (L): 0.1963
2022-06-20 03:56:59,756 - mmpose - INFO - Epoch [3][50/54]  lr: 1.573e-04, eta: 13:35:26, time: 4.610, data_time: 0.048, memory: 29719, heatmap_loss: 0.0013, acc_pose: 0.6667, loss: 0.0013
2022-06-20 04:01:06,153 - mmpose - INFO - Epoch [4][50/54]  lr: 2.113e-04, eta: 13:26:56, time: 4.621, data_time: 0.049, memory: 29719, heatmap_loss: 0.0009, acc_pose: 0.7178, loss: 0.0009
2022-06-20 04:01:21,428 - mmpose - INFO - Saving checkpoint at 4 epochs
2022-06-20 04:01:36,877 - mmpose - INFO - The previous best checkpoint /home/projects/open_mmpose/work_dirs/hrformer_base_sport22_384x288_udp_coarsedropout/best_AP_epoch_2.pth was removed
2022-06-20 04:01:37,965 - mmpose - INFO - Now best checkpoint is saved as best_AP_epoch_4.pth.
2022-06-20 04:01:37,966 - mmpose - INFO - Best AP is 0.2212 at 4 epoch.
2022-06-20 04:01:37,966 - mmpose - INFO - Epoch(val) [4][21]    AP: 0.2212, AP .5: 0.2990, AP .75: 0.2397, AP (M): 0.0372, AP (L): 0.2700, AR: 0.2355, AR .5: 0.3124, AR .75: 0.2509, AR (M): 0.0415, AR (L): 0.2872
2022-06-20 04:05:29,426 - mmpose - INFO - Epoch [5][50/54]  lr: 2.652e-04, eta: 13:20:34, time: 4.628, data_time: 0.048, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7457, loss: 0.0008
2022-06-20 04:09:35,891 - mmpose - INFO - Epoch [6][50/54]  lr: 3.192e-04, eta: 13:14:56, time: 4.624, data_time: 0.049, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7531, loss: 0.0008
2022-06-20 04:09:51,136 - mmpose - INFO - Saving checkpoint at 6 epochs
2022-06-20 04:10:06,614 - mmpose - INFO - Epoch(val) [6][21]    AP: 0.2188, AP .5: 0.2861, AP .75: 0.2291, AP (M): 0.0262, AP (L): 0.2712, AR: 0.2357, AR .5: 0.3087, AR .75: 0.2460, AR (M): 0.0304, AR (L): 0.2903
2022-06-20 04:13:58,389 - mmpose - INFO - Epoch [7][50/54]  lr: 3.731e-04, eta: 13:10:04, time: 4.635, data_time: 0.047, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7570, loss: 0.0008
2022-06-20 04:18:05,511 - mmpose - INFO - Epoch [8][50/54]  lr: 4.271e-04, eta: 13:05:28, time: 4.635, data_time: 0.048, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7653, loss: 0.0008
2022-06-20 04:18:20,851 - mmpose - INFO - Saving checkpoint at 8 epochs
2022-06-20 04:18:36,206 - mmpose - INFO - Epoch(val) [8][21]    AP: 0.2199, AP .5: 0.2912, AP .75: 0.2241, AP (M): 0.0260, AP (L): 0.2727, AR: 0.2333, AR .5: 0.3050, AR .75: 0.2374, AR (M): 0.0327, AR (L): 0.2868
2022-06-20 04:22:28,122 - mmpose - INFO - Epoch [9][50/54]  lr: 4.810e-04, eta: 13:01:04, time: 4.637, data_time: 0.049, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7779, loss: 0.0008
2022-06-20 04:26:35,418 - mmpose - INFO - Epoch [10][50/54] lr: 5.000e-04, eta: 12:56:48, time: 4.639, data_time: 0.047, memory: 29719, heatmap_loss: 0.0008, acc_pose: 0.7742, loss: 0.0008
2022-06-20 04:26:50,692 - mmpose - INFO - Saving checkpoint at 10 epochs
2022-06-20 04:27:08,872 - mmpose - INFO - The previous best checkpoint /home/projects/open_mmpose/work_dirs/hrformer_base_sport22_384x288_udp_coarsedropout/best_AP_epoch_4.pth was removed
2022-06-20 04:27:09,961 - mmpose - INFO - Now best checkpoint is saved as best_AP_epoch_10.pth.
2022-06-20 04:27:09,962 - mmpose - INFO - Best AP is 0.2233 at 10 epoch.
2022-06-20 04:27:09,963 - mmpose - INFO - Epoch(val) [10][21]   AP: 0.2233, AP .5: 0.2970, AP .75: 0.2394, AP (M): 0.0289, AP (L): 0.2743, AR: 0.2375, AR .5: 0.3112, AR .75: 0.2534, AR (M): 0.0357, AR (L): 0.2913
2022-06-20 04:31:01,742 - mmpose - INFO - Epoch [11][50/54] lr: 5.000e-04, eta: 12:52:34, time: 4.635, data_time: 0.046, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7821, loss: 0.0007
2022-06-20 04:35:09,314 - mmpose - INFO - Epoch [12][50/54] lr: 5.000e-04, eta: 12:48:31, time: 4.645, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7985, loss: 0.0007
2022-06-20 04:35:24,584 - mmpose - INFO - Saving checkpoint at 12 epochs
2022-06-20 04:35:39,998 - mmpose - INFO - The previous best checkpoint /home/projects/open_mmpose/work_dirs/hrformer_base_sport22_384x288_udp_coarsedropout/best_AP_epoch_10.pth was removed
2022-06-20 04:35:41,163 - mmpose - INFO - Now best checkpoint is saved as best_AP_epoch_12.pth.
2022-06-20 04:35:41,163 - mmpose - INFO - Best AP is 0.2311 at 12 epoch.
2022-06-20 04:35:41,164 - mmpose - INFO - Epoch(val) [12][21]   AP: 0.2311, AP .5: 0.3063, AP .75: 0.2399, AP (M): 0.0378, AP (L): 0.2842, AR: 0.2498, AR .5: 0.3223, AR .75: 0.2571, AR (M): 0.0497, AR (L): 0.3031
2022-06-20 04:39:33,175 - mmpose - INFO - Epoch [13][50/54] lr: 5.000e-04, eta: 12:44:26, time: 4.640, data_time: 0.046, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7989, loss: 0.0007
2022-06-20 04:43:40,756 - mmpose - INFO - Epoch [14][50/54] lr: 5.000e-04, eta: 12:40:25, time: 4.643, data_time: 0.049, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8053, loss: 0.0007
2022-06-20 04:43:56,111 - mmpose - INFO - Saving checkpoint at 14 epochs
2022-06-20 04:44:11,473 - mmpose - INFO - Epoch(val) [14][21]   AP: 0.2253, AP .5: 0.2955, AP .75: 0.2421, AP (M): 0.0290, AP (L): 0.2798, AR: 0.2419, AR .5: 0.3112, AR .75: 0.2558, AR (M): 0.0409, AR (L): 0.2955
2022-06-20 04:48:03,546 - mmpose - INFO - Epoch [15][50/54] lr: 5.000e-04, eta: 12:36:24, time: 4.640, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7871, loss: 0.0007
2022-06-20 04:52:11,270 - mmpose - INFO - Epoch [16][50/54] lr: 5.000e-04, eta: 12:32:28, time: 4.648, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7999, loss: 0.0007
2022-06-20 04:52:26,688 - mmpose - INFO - Saving checkpoint at 16 epochs
2022-06-20 04:52:42,022 - mmpose - INFO - Epoch(val) [16][21]   AP: 0.2255, AP .5: 0.2888, AP .75: 0.2393, AP (M): 0.0339, AP (L): 0.2774, AR: 0.2418, AR .5: 0.3100, AR .75: 0.2509, AR (M): 0.0415, AR (L): 0.2952
2022-06-20 04:56:34,608 - mmpose - INFO - Epoch [17][50/54] lr: 5.000e-04, eta: 12:28:34, time: 4.652, data_time: 0.046, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8016, loss: 0.0007
2022-06-20 05:00:42,320 - mmpose - INFO - Epoch [18][50/54] lr: 5.000e-04, eta: 12:24:39, time: 4.647, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8031, loss: 0.0007
2022-06-20 05:00:57,699 - mmpose - INFO - Saving checkpoint at 18 epochs
2022-06-20 05:01:13,063 - mmpose - INFO - Epoch(val) [18][21]   AP: 0.1913, AP .5: 0.2495, AP .75: 0.2032, AP (M): 0.0276, AP (L): 0.2347, AR: 0.2060, AR .5: 0.2632, AR .75: 0.2165, AR (M): 0.0292, AR (L): 0.2531
2022-06-20 05:05:05,821 - mmpose - INFO - Epoch [19][50/54] lr: 5.000e-04, eta: 12:20:47, time: 4.655, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7937, loss: 0.0007
2022-06-20 05:09:14,253 - mmpose - INFO - Epoch [20][50/54] lr: 5.000e-04, eta: 12:16:59, time: 4.661, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8232, loss: 0.0007
2022-06-20 05:09:29,643 - mmpose - INFO - Saving checkpoint at 20 epochs
2022-06-20 05:09:46,902 - mmpose - INFO - Epoch(val) [20][21]   AP: 0.1456, AP .5: 0.2068, AP .75: 0.1472, AP (M): 0.0187, AP (L): 0.1805, AR: 0.1569, AR .5: 0.2202, AR .75: 0.1587, AR (M): 0.0129, AR (L): 0.1953
2022-06-20 05:13:39,684 - mmpose - INFO - Epoch [21][50/54] lr: 5.000e-04, eta: 12:13:07, time: 4.655, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8172, loss: 0.0007
2022-06-20 05:17:48,125 - mmpose - INFO - Epoch [22][50/54] lr: 5.000e-04, eta: 12:09:18, time: 4.662, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8091, loss: 0.0007
2022-06-20 05:18:03,466 - mmpose - INFO - Saving checkpoint at 22 epochs
2022-06-20 05:18:18,749 - mmpose - INFO - Epoch(val) [22][21]   AP: 0.0127, AP .5: 0.0271, AP .75: 0.0099, AP (M): 0.0000, AP (L): 0.0159, AR: 0.0116, AR .5: 0.0308, AR .75: 0.0062, AR (M): 0.0000, AR (L): 0.0146
2022-06-20 05:22:11,280 - mmpose - INFO - Epoch [23][50/54] lr: 5.000e-04, eta: 12:05:24, time: 4.650, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8035, loss: 0.0007
2022-06-20 05:26:19,507 - mmpose - INFO - Epoch [24][50/54] lr: 5.000e-04, eta: 12:01:33, time: 4.657, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8063, loss: 0.0007
2022-06-20 05:26:34,932 - mmpose - INFO - Saving checkpoint at 24 epochs
2022-06-20 05:26:50,299 - mmpose - INFO - Epoch(val) [24][21]   AP: 0.0000, AP .5: 0.0000, AP .75: 0.0000, AP (M): 0.0000, AP (L): 0.0000, AR: 0.0000, AR .5: 0.0000, AR .75: 0.0000, AR (M): 0.0000, AR (L): 0.0000
2022-06-20 05:30:43,107 - mmpose - INFO - Epoch [25][50/54] lr: 5.000e-04, eta: 11:57:42, time: 4.656, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8187, loss: 0.0007
2022-06-20 05:34:50,949 - mmpose - INFO - Epoch [26][50/54] lr: 5.000e-04, eta: 11:53:48, time: 4.651, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8167, loss: 0.0007
2022-06-20 05:35:06,258 - mmpose - INFO - Saving checkpoint at 26 epochs
2022-06-20 05:35:21,728 - mmpose - INFO - Epoch(val) [26][21]   AP: 0.0000, AP .5: 0.0000, AP .75: 0.0000, AP (M): 0.0000, AP (L): 0.0000, AR: 0.0000, AR .5: 0.0000, AR .75: 0.0000, AR (M): 0.0000, AR (L): 0.0000
2022-06-20 05:39:14,460 - mmpose - INFO - Epoch [27][50/54] lr: 5.000e-04, eta: 11:49:55, time: 4.654, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.7997, loss: 0.0007
2022-06-20 05:43:22,535 - mmpose - INFO - Epoch [28][50/54] lr: 5.000e-04, eta: 11:46:03, time: 4.653, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8163, loss: 0.0007
2022-06-20 05:43:37,914 - mmpose - INFO - Saving checkpoint at 28 epochs
2022-06-20 05:43:53,201 - mmpose - INFO - Epoch(val) [28][21]   AP: 0.0000, AP .5: 0.0000, AP .75: 0.0000, AP (M): 0.0000, AP (L): 0.0000, AR: 0.0000, AR .5: 0.0000, AR .75: 0.0000, AR (M): 0.0000, AR (L): 0.0000
2022-06-20 05:47:46,284 - mmpose - INFO - Epoch [29][50/54] lr: 5.000e-04, eta: 11:42:12, time: 4.661, data_time: 0.048, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8101, loss: 0.0007
2022-06-20 05:51:54,499 - mmpose - INFO - Epoch [30][50/54] lr: 5.000e-04, eta: 11:38:20, time: 4.656, data_time: 0.047, memory: 29719, heatmap_loss: 0.0007, acc_pose: 0.8134, loss: 0.0007
2022-06-20 05:52:09,841 - mmpose - INFO - Saving checkpoint at 30 epochs
2022-06-20 05:52:25,123 - mmpose - INFO - Epoch(val) [30][21]   AP: 0.0000, AP .5: 0.0000, AP .75: 0.0000, AP (M): 0.0000, AP (L): 0.0000, AR: 0.0000, AR .5: 0.0000, AR .75: 0.0000, AR (M): 0.0000, AR (L): 0.0000
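
To see where the eval metrics start to fall apart without reading the log line by line, the `Epoch(val)` lines can be pulled out with a small script. This is only a sketch: the regular expression is written against the log format shown above, and `parse_val_metrics`, `first_collapse`, and the `ratio` threshold are hypothetical helpers, not part of mmpose.

```python
import re

# Matches lines such as:
# "... Epoch(val) [12][21]   AP: 0.2311, AP .5: 0.3063, ..., AR: 0.2498, ..."
VAL_LINE = re.compile(
    r"Epoch\(val\) \[(\d+)\]\[\d+\]\s+AP: ([\d.]+).*?AR: ([\d.]+)")


def parse_val_metrics(log_text):
    """Return a list of (epoch, AP, AR) tuples found in the log text."""
    rows = []
    for line in log_text.splitlines():
        match = VAL_LINE.search(line)
        if match:
            rows.append((int(match.group(1)),
                         float(match.group(2)),
                         float(match.group(3))))
    return rows


def first_collapse(rows, ratio=0.5):
    """Return the first epoch whose AP is below ratio * best AP so far."""
    best = 0.0
    for epoch, ap, _ in rows:
        if best > 0 and ap < ratio * best:
            return epoch
        best = max(best, ap)
    return None


# Three lines copied (and shortened) from the log above.
SAMPLE = """
2022-06-20 04:35:41,164 - mmpose - INFO - Epoch(val) [12][21]   AP: 0.2311, AP .5: 0.3063, AR: 0.2498, AR .5: 0.3223
2022-06-20 05:09:46,902 - mmpose - INFO - Epoch(val) [20][21]   AP: 0.1456, AP .5: 0.2068, AR: 0.1569, AR .5: 0.2202
2022-06-20 05:18:18,749 - mmpose - INFO - Epoch(val) [22][21]   AP: 0.0127, AP .5: 0.0271, AR: 0.0116, AR .5: 0.0308
"""

rows = parse_val_metrics(SAMPLE)
print(rows)
print(first_collapse(rows))
```

On the full log this makes it easy to line up the AP curve with the checkpoints saved every 2 epochs, so the last good checkpoint and the first bad one can be compared directly.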

configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/sport22/hrformer_base_sport22_384x288_udp_coarsedropout.py

_base_ = [
    '../../../../_base_/default_runtime.py',
    '../../../../_base_/datasets/sport22.py'
]

checkpoint_config = dict(interval=2)
evaluation = dict(interval=2, metric='mAP', save_best='AP')

optimizer = dict(
    type='AdamW',
    lr=5e-4,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={'relative_position_bias_table': dict(decay_mult=0.)}))

optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
target_type = 'GaussianHeatmap'
log_config = dict(
    interval=50, hooks=[
        dict(type='TextLoggerHook'),
    ])

channel_cfg = dict(
    num_output_channels=22,
    dataset_joints=22,
    dataset_channel=[
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
    ],
    inference_channel=[
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
    ])

# model settings
norm_cfg = dict(type='BN', requires_grad=True)  # use SyncBN for distributed training, otherwise BN
model = dict(
    type='TopDown',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrformer_base-32815020_20220226.pth',
    backbone=dict(
        type='HRFormer',
        in_channels=3,
        norm_cfg=norm_cfg,
        extra=dict(
            drop_path_rate=0.3,
            with_rpe=True,
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(2, ),
                num_channels=(64, ),
                num_heads=[2],
                mlp_ratios=[4]),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='HRFORMERBLOCK',
                num_blocks=(2, 2),
                num_channels=(78, 156),
                num_heads=[2, 4],
                mlp_ratios=[4, 4],
                window_sizes=[7, 7]),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='HRFORMERBLOCK',
                num_blocks=(2, 2, 2),
                num_channels=(78, 156, 312),
                num_heads=[2, 4, 8],
                mlp_ratios=[4, 4, 4],
                window_sizes=[7, 7, 7]),
            stage4=dict(
                num_modules=2,
                num_branches=4,
                block='HRFORMERBLOCK',
                num_blocks=(2, 2, 2, 2),
                num_channels=(78, 156, 312, 624),
                num_heads=[2, 4, 8, 16],
                mlp_ratios=[4, 4, 4, 4],
                window_sizes=[7, 7, 7, 7]))),
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=78,
        out_channels=channel_cfg['num_output_channels'],
        num_deconv_layers=0,
        extra=dict(final_conv_kernel=1, ),
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='default',
        shift_heatmap=False,
        target_type=target_type,
        modulate_kernel=17,
        use_udp=True))

data_root = 'data/sport22'
data_cfg = dict(
    image_size=[288, 384],
    heatmap_size=[72, 96],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    bbox_file=f'{data_root}/person_detection_results/COCO_val2017_detections_AP_H_56_person.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownGetBboxCenterScale', padding=1.25),
    dict(type='TopDownRandomShiftBboxCenter', shift_factor=0.16, prob=0.3),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine', use_udp=True),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget',
         sigma=3,
         encoding='UDP',
         target_type=target_type),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownGetBboxCenterScale', padding=1.25),
    dict(type='TopDownAffine', use_udp=True),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data = dict(
    samples_per_gpu=14,  # 14 for 32G, 8/6 for 16G
    workers_per_gpu=2,
    val_dataloader=dict(samples_per_gpu=16),
    test_dataloader=dict(samples_per_gpu=16),
    train=dict(
        type='TopDownSport22Dataset',
        ann_file=f'{data_root}/annotations/new_train_v1.0.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='TopDownSport22Dataset',
        ann_file=f'{data_root}/annotations/new_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='TopDownSport22Dataset',
        ann_file=f'{data_root}/annotations/new_val_v1.0.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
)
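
As a cross-check of the `lr_config` above against the log, the warmup schedule can be reproduced by hand. The sketch below assumes the linear warmup rule as I understand it from mmcv's LR updater hook, and assumes that the log line `[50/54]` of epoch `e` corresponds to the 0-based global iteration `(e - 1) * 54 + 49`. Both assumptions are only checked against the numbers printed in the log; `linear_warmup_lr` and `iter_at` are hypothetical helper names.

```python
ITERS_PER_EPOCH = 54  # taken from "[50/54]" in the train log


def linear_warmup_lr(cur_iter, base_lr=5e-4, warmup_iters=500,
                     warmup_ratio=0.001):
    """Learning rate at a 0-based global iteration under linear warmup.

    After warmup this returns base_lr, which holds until the first
    step milestone (epoch 170 in the config above).
    """
    if cur_iter >= warmup_iters:
        return base_lr
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)


def iter_at(epoch, step=50):
    """0-based global iteration of the step-th batch of a 1-based epoch."""
    return (epoch - 1) * ITERS_PER_EPOCH + (step - 1)


for epoch in (1, 2, 3, 9, 10):
    print(epoch, f"{linear_warmup_lr(iter_at(epoch)):.3e}")
```

With these assumptions the values agree with the `lr` column of the log for epochs 1, 2, 3, 9 and 10, and they show that warmup ends during epoch 10. From then on the model trains at the full `lr=5e-4` with `grad_clip=None`, which is the regime in which the eval metrics later degrade.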

configs/_base_/datasets/sport22.py

dataset_info = dict(
    dataset_name='sport22',
    keypoint_info={
        0:
        dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
        1:
        dict(name='chin', id=1, color=[51, 153, 255], type='upper', swap=''),
        2:
        dict(
            name='left_eye',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='right_eye'),
        3:
        dict(
            name='right_eye',
            id=3,
            color=[51, 153, 255],
            type='upper',
            swap='left_eye'),
        4:
        dict(
            name='left_ear',
            id=4,
            color=[51, 153, 255],
            type='upper',
            swap='right_ear'),
        5:
        dict(
            name='right_ear',
            id=5,
            color=[51, 153, 255],
            type='upper',
            swap='left_ear'),
        6:
        dict(
            name='left_shoulder',
            id=6,
            color=[0, 255, 0],
            type='upper',
            swap='right_shoulder'),
        7:
        dict(
            name='right_shoulder',
            id=7,
            color=[255, 128, 0],
            type='upper',
            swap='left_shoulder'),
        8:
        dict(
            name='left_elbow',
            id=8,
            color=[0, 255, 0],
            type='upper',
            swap='right_elbow'),
        9:
        dict(
            name='right_elbow',
            id=9,
            color=[255, 128, 0],
            type='upper',
            swap='left_elbow'),
        10:
        dict(
            name='left_wrist',
            id=10,
            color=[0, 255, 0],
            type='upper',
            swap='right_wrist'),
        11:
        dict(
            name='right_wrist',
            id=11,
            color=[255, 128, 0],
            type='upper',
            swap='left_wrist'),
        12:
        dict(
            name='left_hip',
            id=12,
            color=[0, 255, 0],
            type='lower',
            swap='right_hip'),
        13:
        dict(
            name='right_hip',
            id=13,
            color=[255, 128, 0],
            type='lower',
            swap='left_hip'),
        14:
        dict(
            name='left_knee',
            id=14,
            color=[0, 255, 0],
            type='lower',
            swap='right_knee'),
        15:
        dict(
            name='right_knee',
            id=15,
            color=[255, 128, 0],
            type='lower',
            swap='left_knee'),
        16:
        dict(
            name='left_ankle',
            id=16,
            color=[0, 255, 0],
            type='lower',
            swap='right_ankle'),
        17:
        dict(
            name='right_ankle',
            id=17,
            color=[255, 128, 0],
            type='lower',
            swap='left_ankle'),
        18:
        dict(
            name='left_heel',
            id=18,
            color=[0, 255, 0],
            type='lower',
            swap='right_heel'),
        19:
        dict(
            name='right_heel',
            id=19,
            color=[255, 128, 0],
            type='lower',
            swap='left_heel'),
        20:
        dict(
            name='left_tiptoe',
            id=20,
            color=[0, 255, 0],
            type='lower',
            swap='right_tiptoe'),
        21:
        dict(
            name='right_tiptoe',
            id=21,
            color=[255, 128, 0],
            type='lower',
            swap='left_tiptoe')
    },
    skeleton_info={
        0:
        dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
        3:
        dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
        4:
        dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
        5:
        dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
        6:
        dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
        7:
        dict(
            link=('left_shoulder', 'right_shoulder'),
            id=7,
            color=[51, 153, 255]),
        8:
        dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
        9:
        dict(
            link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
        10:
        dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
        11:
        dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
        12:
        dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
        13:
        dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
        14:
        dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
        15:
        dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
        16:
        dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
        17:
        dict(link=('nose', 'chin'), id=17, color=[51, 153, 255]),
        18:
        dict(link=('right_heel', 'right_tiptoe'), id=18, color=[255, 128, 0]),
        19:
        dict(link=('right_heel', 'right_ankle'), id=19, color=[255, 128, 0]),
        20:
        dict(link=('right_ankle', 'right_tiptoe'), id=20, color=[255, 128, 0]),
        21:
        dict(link=('left_heel', 'left_tiptoe'), id=21, color=[0, 255, 0]),
        22:
        dict(link=('left_heel', 'left_ankle'), id=22, color=[0, 255, 0]),
        23:
        dict(link=('left_ankle', 'left_tiptoe'), id=23, color=[0, 255, 0])
    },
    joint_weights=[
        1., 1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5,
        1.5, 1.5, 1.5, 1.5, 1.5
    ],
    sigmas=[
        0.026, 0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
        0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089, 0.089, 0.089, 0.089, 0.089
    ])
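
A `dataset_info` of this size is easy to get subtly wrong when extending COCO by hand, so a quick consistency check may help rule it out as a cause. The function below is a hypothetical helper, not an mmpose API. It only assumes the dict layout shown above (`keypoint_info`, `skeleton_info`, `joint_weights`, `sigmas`) and checks ids, swap symmetry, link names, duplicate links, and list lengths.

```python
def check_dataset_info(info):
    """Return a list of human-readable problems found in a dataset_info dict."""
    problems = []
    kpts = info["keypoint_info"]
    by_name = {k["name"]: k for k in kpts.values()}
    n = len(kpts)

    for key, k in kpts.items():
        # Dict key and the 'id' field should agree.
        if key != k["id"]:
            problems.append(f"keypoint key {key} != id {k['id']}")
        # A non-empty swap must point back at this keypoint.
        swap = k["swap"]
        if swap and by_name.get(swap, {}).get("swap") != k["name"]:
            problems.append(f"swap of {k['name']!r} is not symmetric")

    seen = set()
    for s in info["skeleton_info"].values():
        a, b = s["link"]
        for name in (a, b):
            if name not in by_name:
                problems.append(f"link uses unknown keypoint {name!r}")
        pair = frozenset((a, b))
        if pair in seen:
            problems.append(f"duplicate link {a}-{b}")
        seen.add(pair)

    # Per-joint lists must have one entry per keypoint.
    for field in ("joint_weights", "sigmas"):
        if len(info[field]) != n:
            problems.append(
                f"{field} has {len(info[field])} entries, expected {n}")
    return problems


# Toy example with two keypoints; replace with the real dataset_info.
toy = dict(
    keypoint_info={
        0: dict(name='left_heel', id=0, swap='right_heel'),
        1: dict(name='right_heel', id=1, swap='left_heel'),
    },
    skeleton_info={0: dict(link=('left_heel', 'right_heel'), id=0)},
    joint_weights=[1.5, 1.5],
    sigmas=[0.089, 0.089])
print(check_dataset_info(toy))
```

An empty list means none of these checks fired. It does not say whether the `sigmas` chosen for the new keypoints are appropriate, which still has to be judged against the annotations.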

mmpose/datasets/datasets/top_down/topdown_sport22_dataset.py


import os.path as osp
import tempfile
import warnings
from collections import OrderedDict, defaultdict

import json_tricks as json
import numpy as np
from mmcv import Config, deprecated_api_warning
from xtcocotools.cocoeval import COCOeval

from ....core.post_processing import oks_nms, soft_oks_nms
from ...builder import DATASETS
from ..base import Kpt2dSviewRgbImgTopDownDataset

@DATASETS.register_module()
class TopDownSport22Dataset(Kpt2dSviewRgbImgTopDownDataset):
    """Sport22 dataset for top-down pose estimation.

    Sport22 keypoint indexes::

        0: 'nose',
        1: 'chin',
        2: 'left_eye',
        3: 'right_eye',
        4: 'left_ear',
        5: 'right_ear',
        6: 'left_shoulder',
        7: 'right_shoulder',
        8: 'left_elbow',
        9: 'right_elbow',
        10: 'left_wrist',
        11: 'right_wrist',
        12: 'left_hip',
        13: 'right_hip',
        14: 'left_knee',
        15: 'right_knee',
        16: 'left_ankle',
        17: 'right_ankle',
        18: 'left_heel',
        19: 'right_heel',
        20: 'left_tiptoe',
        21: 'right_tiptoe'

    Args:
        ann_file (str): Path to the annotation file.
        img_prefix (str): Path to a directory where images are held.
            Default: None.
        data_cfg (dict): config
        pipeline (list[dict | callable]): A sequence of data transforms.
        dataset_info (DatasetInfo): A class containing all dataset info.
        test_mode (bool): Store True when building test or
            validation dataset. Default: False.
    """

    def __init__(self,
                 ann_file,
                 img_prefix,
                 data_cfg,
                 pipeline,
                 dataset_info=None,
                 test_mode=False):

        if dataset_info is None:
            warnings.warn(
                'dataset_info is missing. '
                'Check https://github.com/open-mmlab/mmpose/pull/663 '
                'for details.', DeprecationWarning)
            cfg = Config.fromfile('configs/_base_/datasets/sport22.py')
            dataset_info = cfg._cfg_dict['dataset_info']

        super().__init__(
            ann_file,
            img_prefix,
            data_cfg,
            pipeline,
            dataset_info=dataset_info,
            test_mode=test_mode)

        self.use_gt_bbox = data_cfg['use_gt_bbox']
        self.bbox_file = data_cfg['bbox_file']
        self.det_bbox_thr = data_cfg.get('det_bbox_thr', 0.0)
        self.use_nms = data_cfg.get('use_nms', True)
        self.soft_nms = data_cfg['soft_nms']
        self.nms_thr = data_cfg['nms_thr']
        self.oks_thr = data_cfg['oks_thr']
        self.vis_thr = data_cfg['vis_thr']

        self.db = self._get_db()

        print(f'=> num_images: {self.num_images}')
        print(f'=> load {len(self.db)} samples')

    def _get_db(self):
        """Load dataset."""
        if (not self.test_mode) or self.use_gt_bbox:
            # use ground truth bbox
            gt_db = self._load_coco_keypoint_annotations()
        else:
            # use bbox from detection
            gt_db = self._load_coco_person_detection_results()
        return gt_db

    def _load_coco_keypoint_annotations(self):
        """Ground truth bbox and keypoints."""
        gt_db = []
        for img_id in self.img_ids:
            gt_db.extend(self._load_coco_keypoint_annotation_kernel(img_id))
        return gt_db

    def _load_coco_keypoint_annotation_kernel(self, img_id):
        """Load the annotations of one image through the COCO API.

        Note:
            bbox: [x1, y1, w, h]

        Args:
            img_id: COCO image id

        Returns:
            list[dict]: db entries, one per valid annotated instance
        """
        img_ann = self.coco.loadImgs(img_id)[0]
        width = img_ann['width']
        height = img_ann['height']
        num_joints = self.ann_info['num_joints']

        ann_ids = self.coco.getAnnIds(imgIds=img_id, iscrowd=False)
        objs = self.coco.loadAnns(ann_ids)

        # sanitize bboxes
        valid_objs = []
        for obj in objs:
            if 'bbox' not in obj:
                continue
            x, y, w, h = obj['bbox']
            x1 = max(0, x)
            y1 = max(0, y)
            x2 = min(width - 1, x1 + max(0, w))
            y2 = min(height - 1, y1 + max(0, h))
            if ('area' not in obj or obj['area'] > 0) and x2 > x1 and y2 > y1:
                obj['clean_bbox'] = [x1, y1, x2 - x1, y2 - y1]
                valid_objs.append(obj)
        objs = valid_objs

        bbox_id = 0
        rec = []
        for obj in objs:
            if 'keypoints' not in obj:
                continue
            if max(obj['keypoints']) == 0:
                continue
            if 'num_keypoints' in obj and obj['num_keypoints'] == 0:
                continue
            joints_3d = np.zeros((num_joints, 3), dtype=np.float32)
            joints_3d_visible = np.zeros((num_joints, 3), dtype=np.float32)

            keypoints = np.array(obj['keypoints']).reshape(-1, 3)
            joints_3d[:, :2] = keypoints[:, :2]
            joints_3d_visible[:, :2] = np.minimum(1, keypoints[:, 2:3])

            image_file = osp.join(self.img_prefix, self.id2name[img_id])
            rec.append({
                'image_file': image_file,
                'bbox': obj['clean_bbox'][:4],
                'rotation': 0,
                'joints_3d': joints_3d,
                'joints_3d_visible': joints_3d_visible,
                'dataset': self.dataset_name,
                'bbox_score': 1,
                'bbox_id': bbox_id
            })
            bbox_id = bbox_id + 1

        return rec

    def _load_coco_person_detection_results(self):
        """Load coco person detection results."""
        num_joints = self.ann_info['num_joints']
        all_boxes = None
        with open(self.bbox_file, 'r') as f:
            all_boxes = json.load(f)

        if not all_boxes:
            raise ValueError('=> Load %s fail!' % self.bbox_file)

        print(f'=> Total boxes: {len(all_boxes)}')

        kpt_db = []
        bbox_id = 0
        for det_res in all_boxes:
            if det_res['category_id'] != 1:
                continue

            image_file = osp.join(self.img_prefix,
                                  self.id2name[det_res['image_id']])
            box = det_res['bbox']
            score = det_res['score']

            if score < self.det_bbox_thr:
                continue

            joints_3d = np.zeros((num_joints, 3), dtype=np.float32)
            joints_3d_visible = np.ones((num_joints, 3), dtype=np.float32)
            kpt_db.append({
                'image_file': image_file,
                'rotation': 0,
                'bbox': box[:4],
                'bbox_score': score,
                'dataset': self.dataset_name,
                'joints_3d': joints_3d,
                'joints_3d_visible': joints_3d_visible,
                'bbox_id': bbox_id
            })
            bbox_id = bbox_id + 1
        print(f'=> Total boxes after filter '
              f'low score@{self.det_bbox_thr}: {bbox_id}')
        return kpt_db

    @deprecated_api_warning(name_dict=dict(outputs='results'))
    def evaluate(self, results, res_folder=None, metric='mAP', **kwargs):
        """Evaluate coco keypoint results. The pose prediction results will be
        saved in ``${res_folder}/result_keypoints.json``.

        Note:
            - batch_size: N
            - num_keypoints: K
            - heatmap height: H
            - heatmap width: W

        Args:
            results (list[dict]): Testing results containing the following
                items:

                - preds (np.ndarray[N,K,3]): The first two dimensions are \
                    coordinates, score is the third dimension of the array.
                - boxes (np.ndarray[N,6]): [center[0], center[1], scale[0], \
                    scale[1], area, score]
                - image_paths (list[str]): For example, ['data/coco/val2017\
                    /000000393226.jpg']
                - heatmap (np.ndarray[N, K, H, W]): model output heatmap
                - bbox_ids (list[int]).
            res_folder (str, optional): The folder to save the testing
                results. If not specified, a temp folder will be created.
                Default: None.
            metric (str | list[str]): Metric to be performed. Defaults: 'mAP'.

        Returns:
            dict: Evaluation results for evaluation metric.
        """
        metrics = metric if isinstance(metric, list) else [metric]
        allowed_metrics = ['mAP']
        for metric in metrics:
            if metric not in allowed_metrics:
                raise KeyError(f'metric {metric} is not supported')

        if res_folder is not None:
            tmp_folder = None
            res_file = osp.join(res_folder, 'result_keypoints.json')
        else:
            tmp_folder = tempfile.TemporaryDirectory()
            res_file = osp.join(tmp_folder.name, 'result_keypoints.json')

        kpts = defaultdict(list)

        for result in results:
            preds = result['preds']
            boxes = result['boxes']
            image_paths = result['image_paths']
            bbox_ids = result['bbox_ids']

            batch_size = len(image_paths)
            for i in range(batch_size):
                image_id = self.name2id[image_paths[i][len(self.img_prefix):]]
                kpts[image_id].append({
                    'keypoints': preds[i],
                    'center': boxes[i][0:2],
                    'scale': boxes[i][2:4],
                    'area': boxes[i][4],
                    'score': boxes[i][5],
                    'image_id': image_id,
                    'bbox_id': bbox_ids[i]
                })
        kpts = self._sort_and_unique_bboxes(kpts)

        # rescoring and oks nms
        num_joints = self.ann_info['num_joints']
        vis_thr = self.vis_thr
        oks_thr = self.oks_thr
        valid_kpts = []
        for image_id in kpts.keys():
            img_kpts = kpts[image_id]
            for n_p in img_kpts:
                box_score = n_p['score']
                kpt_score = 0
                valid_num = 0
                for n_jt in range(0, num_joints):
                    t_s = n_p['keypoints'][n_jt][2]
                    if t_s > vis_thr:
                        kpt_score = kpt_score + t_s
                        valid_num = valid_num + 1
                if valid_num != 0:
                    kpt_score = kpt_score / valid_num
                # rescoring
                n_p['score'] = kpt_score * box_score

            if self.use_nms:
                nms = soft_oks_nms if self.soft_nms else oks_nms
                keep = nms(img_kpts, oks_thr, sigmas=self.sigmas)
                valid_kpts.append([img_kpts[_keep] for _keep in keep])
            else:
                valid_kpts.append(img_kpts)

        self._write_coco_keypoint_results(valid_kpts, res_file)

        # do evaluation only if the ground truth keypoint annotations exist
        if 'annotations' in self.coco.dataset:
            info_str = self._do_python_keypoint_eval(res_file)
            name_value = OrderedDict(info_str)

            if tmp_folder is not None:
                tmp_folder.cleanup()
        else:
            warnings.warn(f'Due to the absence of ground truth keypoint '
                          f'annotations, the quantitative evaluation can not '
                          f'be conducted. The prediction results have been '
                          f'saved at: {osp.abspath(res_file)}')
            name_value = {}

        return name_value

    def _write_coco_keypoint_results(self, keypoints, res_file):
        """Write results into a json file."""
        data_pack = [{
            'cat_id': self._class_to_coco_ind[cls],
            'cls_ind': cls_ind,
            'cls': cls,
            'ann_type': 'keypoints',
            'keypoints': keypoints
        } for cls_ind, cls in enumerate(self.classes)
                     if cls != '__background__']

        results = self._coco_keypoint_results_one_category_kernel(data_pack[0])

        with open(res_file, 'w') as f:
            json.dump(results, f, sort_keys=True, indent=4, allow_nan=True)

    def _coco_keypoint_results_one_category_kernel(self, data_pack):
        """Get coco keypoint results."""
        cat_id = data_pack['cat_id']
        keypoints = data_pack['keypoints']
        cat_results = []

        for img_kpts in keypoints:
            if len(img_kpts) == 0:
                continue

            _key_points = np.array(
                [img_kpt['keypoints'] for img_kpt in img_kpts])
            key_points = _key_points.reshape(-1,
                                             self.ann_info['num_joints'] * 3)

            result = [{
                'image_id': img_kpt['image_id'],
                'category_id': cat_id,
                'keypoints': key_point.tolist(),
                'score': float(img_kpt['score']),
                'center': img_kpt['center'].tolist(),
                'scale': img_kpt['scale'].tolist()
            } for img_kpt, key_point in zip(img_kpts, key_points)]

            cat_results.extend(result)

        return cat_results

    def _do_python_keypoint_eval(self, res_file):
        """Keypoint evaluation using COCOAPI."""
        coco_det = self.coco.loadRes(res_file)
        coco_eval = COCOeval(self.coco, coco_det, 'keypoints', self.sigmas)
        coco_eval.params.useSegm = None
        coco_eval.evaluate()
        coco_eval.accumulate()
        coco_eval.summarize()

        stats_names = [
            'AP', 'AP .5', 'AP .75', 'AP (M)', 'AP (L)', 'AR', 'AR .5',
            'AR .75', 'AR (M)', 'AR (L)'
        ]

        info_str = list(zip(stats_names, coco_eval.stats))

        return info_str

    def _sort_and_unique_bboxes(self, kpts, key='bbox_id'):
        """sort kpts and remove the repeated ones."""
        for img_id, persons in kpts.items():
            num = len(persons)
            kpts[img_id] = sorted(kpts[img_id], key=lambda x: x[key])
            for i in range(num - 1, 0, -1):
                if kpts[img_id][i][key] == kpts[img_id][i - 1][key]:
                    del kpts[img_id][i]

        return kpts
```
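
For readers following the rescoring step in `evaluate` above, here is a minimal standalone sketch of the same arithmetic. The function name `rescore` and the plain-list input are assumptions made for illustration; they are not part of mmpose.

```python
def rescore(keypoints, box_score, vis_thr):
    """Mean confidence of keypoints above vis_thr, times the box score.

    keypoints: list of [x, y, confidence] triples (hypothetical input shape).
    """
    # Keep only the confidences that pass the visibility threshold.
    confident = [kpt[2] for kpt in keypoints if kpt[2] > vis_thr]
    # With no confident keypoint the keypoint score stays 0, as in the loop above.
    kpt_score = sum(confident) / len(confident) if confident else 0
    return kpt_score * box_score
```

Note that a NaN confidence fails the `> vis_thr` comparison, so a pose whose confidences are all NaN ends up with a final score of 0. Whether that is what happens in this issue is not established here.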

mm-assistant[bot] commented 2 years ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

EbenezerO commented 2 years ago

Got it, thanks!

EbenezerO commented 2 years ago

@liqikai9 @ly015 @piercus @lindahua @zhiqwang @motokimura

jin-s13 commented 2 years ago

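I'd suggest first checking (1) whether the label annotations are accurate, and (2) whether configs/base/dataset/sport22.py is written correctly.

Also try training for a while longer and see whether the metrics recover.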
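
The two checks suggested above could be partly automated. The following is a minimal sketch, assuming a COCO-style annotation dict already loaded from JSON; the helper name `check_keypoint_annotations` and the way `num_joints` and `sigmas` are passed in are hypothetical, not an mmpose API.

```python
def check_keypoint_annotations(coco_dict, num_joints, sigmas):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    # The dataset config should define one sigma per keypoint.
    if len(sigmas) != num_joints:
        problems.append(
            f'sigmas has {len(sigmas)} entries, expected {num_joints}')
    for ann in coco_dict.get('annotations', []):
        kpts = ann.get('keypoints', [])
        # COCO stores keypoints as a flat [x, y, v, x, y, v, ...] list.
        if len(kpts) != num_joints * 3:
            problems.append(
                f"annotation {ann.get('id')}: {len(kpts)} keypoint values, "
                f'expected {num_joints * 3}')
            continue
        # num_keypoints, when present, should count the labeled (v > 0) points.
        labeled = sum(1 for v in kpts[2::3] if v > 0)
        if ann.get('num_keypoints', labeled) != labeled:
            problems.append(
                f"annotation {ann.get('id')}: num_keypoints="
                f"{ann['num_keypoints']} but {labeled} points are labeled")
    return problems
```

An empty return value only means these structural checks passed; it does not confirm that keypoint order or flip pairs in the dataset config are correct.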

EbenezerO commented 2 years ago

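I'd suggest first checking (1) whether the label annotations are accurate, and (2) whether configs/base/dataset/sport22.py is written correctly.

Also try training for a while longer and see whether the metrics recover.

@jin-s13 Thanks, but I had already verified these points before asking the maintainers for help. This behavior is very strange!

(1) The labels have been checked: the annotations are consistent with the keypoint indices in configs/base/dataset/sport22.py, and the human keypoints are displayed correctly when visualized with cocoapi show_anns, so the annotated keypoint positions are correct. (2) AP dropped sharply at epoch 18 and reached 0 at epoch 22; after training on to epoch 40 it still stays at 0.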
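
Since the labels look correct, one further thing that may be worth inspecting is the prediction file that `evaluate` writes (`result_keypoints.json` in `res_folder`). The sketch below is only a diagnostic idea, assuming the file has been loaded with `json.load` into a list of dicts with `keypoints` and `score` fields as written by the code above; the function name `summarize_predictions` is hypothetical.

```python
import math


def summarize_predictions(results):
    """Count predictions with non-finite values or a zero final score."""
    summary = {'total': len(results), 'non_finite': 0, 'zero_score': 0}
    for res in results:
        values = list(res['keypoints']) + [res['score']]
        # NaN or inf anywhere in a prediction points to a numerical problem.
        if not all(math.isfinite(v) for v in values):
            summary['non_finite'] += 1
        elif res['score'] == 0:
            summary['zero_score'] += 1
    return summary
```

Python's `json.load` accepts `NaN` literals by default, so a file written with `allow_nan=True` can be read back and checked this way. Comparing the summary at an epoch before the drop with one after it may help tell a numerical blow-up apart from an evaluation or config issue.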

jin-s13 commented 2 years ago

In our experience, HRFormer training is somewhat unstable. Please also try a res50 model in parallel to see whether the problem lies with the model.

EbenezerO commented 2 years ago

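In our experience, HRFormer training is somewhat unstable. Please also try a res50 model in parallel to see whether the problem lies with the model.

OK, thanks. I'll try it.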

ShaneCan commented 2 years ago

Is my case the same as yours? Mine is still unresolved...

EbenezerO commented 2 years ago

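@jin-s13 Thanks. I tried a CNN-based model and it works as expected: the loss converges gradually and AP rises steadily.

But why does this strange behavior occur? Is HRFormer built specifically for the 17 COCO keypoints, so that it does not work with a different number of keypoints and AP drops to 0?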

jin-s13 commented 2 years ago

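We don't fully understand the underlying cause. Empirically, transformer-based models just seem to be less stable. In principle, HRFormer is not restricted to the 17 COCO keypoints.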

ly015 commented 2 years ago

@liqikai9 @ly015 @piercus @lindahua @zhiqwang @motokimura

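Thanks for using mmpose and opening this issue. We check issues regularly and will reply promptly or assign them to the relevant developers. There is no need to @-mention users manually like this in the future; some of them are not mmpose developers.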

EbenezerO commented 2 years ago

@liqikai9 @ly015 @piercus @lindahua @zhiqwang @motokimura

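Thanks for using mmpose and opening this issue. We check issues regularly and will reply promptly or assign them to the relevant developers. There is no need to @-mention users manually like this in the future; some of them are not mmpose developers.

OK.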

jin-s13 commented 2 years ago

I will move this discussion to GitHub Discussions.
