open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

How Do I Use a COCO-Annotated Custom Keypoint Dataset for Custom Objects in MMPose #1029

Open AliButtar opened 2 years ago

AliButtar commented 2 years ago

Hi,

I have annotated a custom dataset in COCO format with keypoints. It's a square object whose keypoints are the corners. My simple question is: how do I use this annotated data directly with MMPose in the simplest way, given that it is already in COCO format?

I have been following the tutorial notebook and here are the things I have tried so far after running into many errors:

# set basic configs
cfg.data_root = 'data/myObjs'
cfg.work_dir = 'work_dirs/hrnet_w32_coco_tiny_256x192'
cfg.gpu_ids = range(1)
cfg.seed = 0

# set log interval
cfg.log_config.interval = 1

# set evaluation configs
cfg.evaluation.interval = 10
cfg.evaluation.metric = 'PCK'
cfg.evaluation.save_best = 'PCK'

# set learning rate policy
cfg.lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=10,
    warmup_ratio=0.001,
    step=[17, 35])
cfg.total_epochs = 40

# set batch size
cfg.data.samples_per_gpu = 16
cfg.data.val_dataloader = dict(samples_per_gpu=16)
cfg.data.test_dataloader = dict(samples_per_gpu=16)

# set dataset configs
cfg.data.train.type = 'TopDownCocoDataset'
cfg.data.train.ann_file = f'{cfg.data_root}/test.json'
cfg.data.train.img_prefix = f'{cfg.data_root}/images/'

cfg.data.val.type = 'TopDownCocoDataset'
cfg.data.val.ann_file = f'{cfg.data_root}/test.json'
cfg.data.val.img_prefix = f'{cfg.data_root}/images/'

cfg.data.test.type = 'TopDownCocoDataset'
cfg.data.test.ann_file = f'{cfg.data_root}/test.json'
cfg.data.test.img_prefix = f'{cfg.data_root}/images/'

print(cfg.pretty_text)


* The print from the above cell still output a lot of information about human pose estimation, which was not what I wanted, and since training didn't begin I thought it was at fault. So I traced it to `mmpose/configs/_base_/datasets/coco.py` and edited all the information there to match my scenario.

dataset_info = dict(
    dataset_name='coco',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Microsoft coco: Common objects in context',
        container='European conference on computer vision',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(
            name='top_left',
            id=0,
            color=[51, 153, 255],
            type='upper',
            swap='top_right'),
        1:
        dict(
            name='top_right',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='top_left'),
        2:
        dict(
            name='bottom_left',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='bottom_right'),
        3:
        dict(
            name='bottom_right',
            id=3,
            color=[51, 153, 255],
            type='upper',
            swap='bottom_left')
    },
    skeleton_info={
        0:
        dict(link=('top_left', 'top_right'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('top_left', 'bottom_left'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('top_right', 'bottom_right'), id=2, color=[255, 128, 0]),
        3:
        dict(link=('bottom_left', 'bottom_right'), id=3, color=[255, 128, 0])
    },
    joint_weights=[1., 1., 1., 1.],
    sigmas=[0.026, 0.025, 0.025, 0.035])


* I finally see some output when I run the training cell, but it errors again because the file `data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json` is not present. I remember seeing it in the MD file that shows how to use a dataset of your own, but I don't have this file for my dataset. I downloaded it anyway and ran again, but training does not progress; it throws this error and stops:

![image](https://user-images.githubusercontent.com/11333899/143453556-6d424284-e48e-4dd4-bf74-a21044c4fd78.png)

* Then I saw that this file is referenced in the config through a variable called `bbox_file`; I tried commenting that out, but that didn't work either.

So basically, how can I just begin training for my use case by simply pointing to my COCO-formatted dataset, which already contains all the information about skeletons, keypoints, segmentation, bounding boxes, etc.?

I saw #981, but the solution the OP mentions in that issue is very vague to me in the first step. He says to create a package and then a class definition, but what exactly do I write in there for my dataset to work, and do I even need to do that, given that my dataset is already in COCO format?

Thank you so much for reading the long post. I might be missing something fundamental here, but I would really appreciate the help.

Thanks

P.S. The reason I have a single JSON file named `test.json` and pass it for all three splits is that, right now, I am just trying to get the training process running while more data is being collected and labeled.
jin-s13 commented 2 years ago

The problem is about `data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json`. This file is used for evaluation only: it provides detected bounding boxes, which are used for keypoint evaluation. Since you do not have it, you may choose to use ground-truth bounding boxes for evaluation.

https://github.com/open-mmlab/mmpose/blob/4963fc4d46cf34a6e314a9f8ff6d1be241b042c5/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py#L99

Here, you can set use_gt_bbox=True.
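In the notebook-style config from above, this would look roughly like the following (an untested sketch, assuming the same `cfg` object; the key names follow the data_cfg section of hrnet_w32_coco_256x192.py):

# evaluate with ground-truth boxes instead of the detection-result file
cfg.data.val.data_cfg.use_gt_bbox = True
cfg.data.test.data_cfg.use_gt_bbox = True
# bbox_file is only read when use_gt_bbox is False, so it is not needed here
cfg.data.val.data_cfg.bbox_file = ''
cfg.data.test.data_cfg.bbox_file = ''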

AliButtar commented 2 years ago

Thank you for the feedback. The suggestion you mentioned worked and the model trained, but I also had to comment out these two lines from the code above, as they gave an error about the PCK metric:

cfg.evaluation.metric = 'PCK'
cfg.evaluation.save_best = 'PCK'

Other than this, I would like to ask whether the process I am using to train the model on my own custom data is right. It would be great if you could confirm this.

Also, the MMPose installation in the tutorial notebook seems to have some dependency issues. The inference code does not run and gives this error:

[screenshot of the inference error]

These are the current version Colab Installs

torch version: 1.10.0+cu111 True
torchvision version: 0.11.1+cu111
mmpose version: 0.20.0

I also tried torch==1.9.0+cu111 and torchvision==0.10.0+cu111, but it didn't work and still gave this error. This inference error also appears in one of the top cells, where the notebook only runs inference with a pretrained model before moving on to training. Can you guide me on how to resolve this?

Thanks a lot.

AliButtar commented 2 years ago

One more question, from looking at the inference code in the tutorial notebook: am I correct in assuming that a detection model is first used to detect a person, and then a pose model is used to perform inference for the pose?

So, to correctly use this framework on my problem, I would have to first train a detector that detects my object in the photo, and then detect the keypoints on that object?

jin-s13 commented 2 years ago
  1. "I also had to comment out these two lines from the code above as it gave an error about PCK metric." Yes, for the COCO dataset, AP is used for evaluation.

  2. "I would like to ask whether the process I am using to train the model on my own custom data is right." I did not check very carefully, but it looks good. You may check whether the loss decreases and the accuracy increases.

  3. "The MMPose installation in the tutorial notebook seems to have some dependency issues." @ly015 Could you please help check this problem? It seems that mmcv-full is not installed properly. Maybe you can uninstall and re-install the latest mmcv-full.

  4. "So I would have to first train a detector that detects my object in the photo and then detect the keypoints on that object?" For top-down algorithms, yes. You may use MMDetection to train a detector (see the sketch below). You can also try bottom-up approaches, e.g. Associative Embedding (AE), which do not rely on object detection.
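For reference, the two-stage top-down flow would look roughly like this (a rough sketch using the high-level APIs from the tutorial notebook; the config/checkpoint paths are hypothetical placeholders for your own trained detector and pose model):

from mmdet.apis import init_detector, inference_detector
from mmpose.apis import (init_pose_model, inference_top_down_pose_model,
                         vis_pose_result)

# stage 1: a (custom, single-class) detector trained with MMDetection
det_model = init_detector('my_det_config.py', 'my_det_ckpt.pth', device='cuda:0')
# stage 2: the pose model that predicts keypoints inside each detected box
pose_model = init_pose_model('my_pose_config.py', 'my_pose_ckpt.pth', device='cuda:0')

img = 'demo.jpg'
mmdet_results = inference_detector(det_model, img)
# take the boxes of the first detector class and wrap them the way the
# mmpose demo scripts expect: a list of dicts with a 'bbox' entry (x1, y1, x2, y2, score)
object_results = [{'bbox': bbox} for bbox in mmdet_results[0]]

pose_results, _ = inference_top_down_pose_model(
    pose_model, img, object_results, bbox_thr=0.3,
    format='xyxy', dataset='TopDownCocoDataset')
vis_pose_result(pose_model, img, pose_results,
                dataset='TopDownCocoDataset', out_file='vis_result.jpg')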

AliButtar commented 2 years ago
  1. Understood

  2. Alright, I'll check that.

  3. Sorry, I know you tagged your colleague there, but are you asking me to uninstall and reinstall mmcv-full? If so, I have only tried this on Colab notebooks so far, where it takes almost 30 minutes to install. I'll try it and check.

  4. I see. Would my COCO dataset then be compatible with these bottom-up approaches? I'll check this myself, but I am asking in case I should be aware of any other changes here.

Thanks again.

jin-s13 commented 2 years ago

"Are you asking me to uninstall and reinstall mmcv-full?" Yes, please try again.

"Would my COCO dataset be compatible with these bottom-up approaches?" Yes, please have a try.

AliButtar commented 2 years ago

I have tried the bottom-up approaches but unfortunately I just get the following error:

[screenshot of the error]

I used the following file: /content/mmpose/configs/body/2d_kpt_sview_rgb_img/associative_embedding/coco/higherhrnet_w32_coco_512x512.py

I made what I thought were the appropriate changes to this file: I changed num_joints from 17 to 4 everywhere it was used, the same change I made in the top-down config file to make it work. I assume there is some parameter configuration I am missing here.

_base_ = ['../../../../_base_/datasets/coco.py']
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=50)
evaluation = dict(interval=50, metric='mAP', save_best='AP')

optimizer = dict(
    type='Adam',
    lr=0.0015,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[200, 260])
total_epochs = 300
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    dataset_joints=4,
    dataset_channel=[
        [0, 1, 2, 3],
    ],
    inference_channel=[
        0, 1, 2, 3
    ])

data_cfg = dict(
    image_size=512,
    base_size=256,
    base_sigma=2,
    heatmap_size=[128, 256],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    num_scales=2,
    scale_aware_sigma=False,
)

# model settings
model = dict(
    type='AssociativeEmbedding',
    pretrained='https://download.openmmlab.com/mmpose/'
    'pretrain_models/hrnet_w32-36af842e.pth',
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256))),
    ),
    keypoint_head=dict(
        type='AEHigherResolutionHead',
        in_channels=32,
        num_joints=4,
        tag_per_joint=True,
        extra=dict(final_conv_kernel=1, ),
        num_deconv_layers=1,
        num_deconv_filters=[32],
        num_deconv_kernels=[4],
        num_basic_blocks=4,
        cat_output=[True],
        with_ae_loss=[True, False],
        loss_keypoint=dict(
            type='MultiLossFactory',
            num_joints=4,
            num_stages=2,
            ae_loss_type='exp',
            with_ae_loss=[True, False],
            push_loss_factor=[0.001, 0.001],
            pull_loss_factor=[0.001, 0.001],
            with_heatmaps_loss=[True, True],
            heatmaps_loss_factor=[1.0, 1.0])),
    train_cfg=dict(),
    test_cfg=dict(
        num_joints=channel_cfg['dataset_joints'],
        max_num_people=30,
        scale_factor=[1],
        with_heatmaps=[True, True],
        with_ae=[True, False],
        project2image=True,
        align_corners=False,
        nms_kernel=5,
        nms_padding=2,
        tag_per_joint=True,
        detection_threshold=0.1,
        tag_threshold=1,
        use_detection_val=True,
        ignore_too_much=False,
        adjust=True,
        refine=True,
        flip_test=True))

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='BottomUpRandomAffine',
        rot_factor=30,
        scale_factor=[0.75, 1.5],
        scale_type='short',
        trans_factor=40),
    dict(type='BottomUpRandomFlip', flip_prob=0.5),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='BottomUpGenerateTarget',
        sigma=2,
        max_num_people=30,
    ),
    dict(
        type='Collect',
        keys=['img', 'joints', 'targets', 'masks'],
        meta_keys=[]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='BottomUpGetImgSize', test_scale_factor=[1]),
    dict(
        type='BottomUpResizeAlign',
        transforms=[
            dict(type='ToTensor'),
            dict(
                type='NormalizeTensor',
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
        ]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'aug_data', 'test_scale_factor', 'base_size',
            'center', 'scale', 'flip_index'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/coco'
data = dict(
    workers_per_gpu=2,
    train_dataloader=dict(samples_per_gpu=24),
    val_dataloader=dict(samples_per_gpu=1),
    test_dataloader=dict(samples_per_gpu=1),
    train=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_train2017.json',
        img_prefix=f'{data_root}/train2017/',
        data_cfg=data_cfg,
        pipeline=train_pipeline,
        dataset_info={{_base_.dataset_info}}),
    val=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=val_pipeline,
        dataset_info={{_base_.dataset_info}}),
    test=dict(
        type='BottomUpCocoDataset',
        ann_file=f'{data_root}/annotations/person_keypoints_val2017.json',
        img_prefix=f'{data_root}/val2017/',
        data_cfg=data_cfg,
        pipeline=test_pipeline,
        dataset_info={{_base_.dataset_info}}),
)

Please have a look and let me know what I might not be changing appropriately for my dataset.

jin-s13 commented 2 years ago

The line `_base_ = ['../../../../_base_/datasets/coco.py']` should be replaced with your own 4-keypoint dataset info file.
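For example (the file name here is hypothetical; point it at the dataset-info file you created/edited for the 4-keypoint object):

_base_ = ['../../../../_base_/datasets/my_square_object.py']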

AliButtar commented 2 years ago

Apologies, my bad. That worked; I was loading the wrong configuration file, higherhrnet instead of the hrnet file where I was making the changes.

The training started, but I am now getting this error when it tries to save the model:


AssertionError                            Traceback (most recent call last)
<ipython-input-18-1b2b2ac6a433> in <module>()
     15 # train model
     16 train_model(
---> 17     model, datasets, cfg, distributed=False, validate=True, meta=dict())

13 frames
/content/mmpose/mmpose/apis/train.py in train_model(model, dataset, cfg, distributed, validate, timestamp, meta)
    154     elif cfg.load_from:
    155         runner.load_checkpoint(cfg.load_from)
--> 156     runner.run(data_loaders, cfg.workflow, cfg.total_epochs)

/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py in run(self, data_loaders, workflow, max_epochs, **kwargs)
    125                     if mode == 'train' and self.epoch >= self._max_epochs:
    126                         break
--> 127                     epoch_runner(data_loaders[i], **kwargs)
    128 
    129         time.sleep(1)  # wait for some hooks like loggers to finish

/usr/local/lib/python3.7/dist-packages/mmcv/runner/epoch_based_runner.py in train(self, data_loader, **kwargs)
     52             self._iter += 1
     53 
---> 54         self.call_hook('after_train_epoch')
     55         self._epoch += 1
     56 

/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py in call_hook(self, fn_name)
    305         """
    306         for hook in self._hooks:
--> 307             getattr(hook, fn_name)(self)
    308 
    309     def get_hook_info(self):

/usr/local/lib/python3.7/dist-packages/mmcv/runner/hooks/evaluation.py in after_train_epoch(self, runner)
    265         """Called after every training epoch to evaluate the results."""
    266         if self.by_epoch and self._should_evaluate(runner):
--> 267             self._do_evaluate(runner)
    268 
    269     def _do_evaluate(self, runner):

/usr/local/lib/python3.7/dist-packages/mmcv/runner/hooks/evaluation.py in _do_evaluate(self, runner)
    269     def _do_evaluate(self, runner):
    270         """perform evaluation and save ckpt."""
--> 271         results = self.test_fn(runner.model, self.dataloader)
    272         runner.log_buffer.output['eval_iter_num'] = len(self.dataloader)
    273         key_score = self.evaluate(runner, results)

/content/mmpose/mmpose/apis/test.py in single_gpu_test(model, data_loader)
     31     for data in data_loader:
     32         with torch.no_grad():
---> 33             result = model(return_loss=False, **data)
     34         results.append(result)
     35 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/mmcv/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
     48             return self.module(*inputs[0], **kwargs[0])
     49         else:
---> 50             return super().forward(*inputs, **kwargs)
     51 
     52     def scatter(self, inputs, kwargs, device_ids):

/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
    164 
    165             if len(self.device_ids) == 1:
--> 166                 return self.module(*inputs[0], **kwargs[0])
    167             replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
    168             outputs = self.parallel_apply(replicas, inputs, kwargs)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/mmcv/runner/fp16_utils.py in new_func(*args, **kwargs)
     96                                 'method of nn.Module')
     97             if not (hasattr(args[0], 'fp16_enabled') and args[0].fp16_enabled):
---> 98                 return old_func(*args, **kwargs)
     99 
    100             # get the arg spec of the decorated method

/content/mmpose/mmpose/models/detectors/associative_embedding.py in forward(self, img, targets, masks, joints, img_metas, return_loss, return_heatmap, **kwargs)
    131                                       **kwargs)
    132         return self.forward_test(
--> 133             img, img_metas, return_heatmap=return_heatmap, **kwargs)
    134 
    135     def forward_train(self, img, targets, masks, joints, img_metas, **kwargs):

/content/mmpose/mmpose/models/detectors/associative_embedding.py in forward_test(self, img, img_metas, return_heatmap, **kwargs)
    214             scale (np.ndarray): the scale of image
    215         """
--> 216         assert img.size(0) == 1
    217         assert len(img_metas) == 1
    218 

AssertionError: 

Thanks

jin-s13 commented 2 years ago

val_dataloader=dict(samples_per_gpu=1),

For bottom-up models, during evaluation, only batchsize=1 is supported.
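If the dataloader overrides from the earlier notebook cell are still in effect, resetting them would look like this (a small sketch against the same `cfg` object):

cfg.data.val_dataloader = dict(samples_per_gpu=1)
cfg.data.test_dataloader = dict(samples_per_gpu=1)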

AliButtar commented 2 years ago

@jin-s13 Thanks a lot, the training ran completely without any errors after that change.

On another note, I tried reinstalling mmcv-full, simply with `pip uninstall mmcv-full` followed by `pip install mmcv-full`, but the error I mentioned earlier is still there.

haichaoyu commented 2 years ago

Hi @jin-s13, just to make sure I understand correctly: to train a model on a new dataset with, e.g., 4 keypoints in COCO format, one can just (1) convert the data/annotations into COCO format, (2) modify the data config file (https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/associative_embedding/coco/hrnet_w32_coco_512x512_udp.py#L1) to support 4 keypoints, and (3) modify the model's output channel number? All the other training and evaluation code will then be compatible with the new dataset.

jin-s13 commented 2 years ago

You also need to modify the dataset info file: https://github.com/open-mmlab/mmpose/blob/master/configs/_base_/datasets/coco.py
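In summary, the places to touch for an N-keypoint COCO-format dataset are roughly (not an exhaustive recipe):

  1. configs/_base_/datasets/<your_dataset>.py: a dataset_info with N keypoint_info entries, N joint_weights and N sigmas.
  2. channel_cfg: dataset_joints, dataset_channel and inference_channel sized to N.
  3. model.keypoint_head: num_joints=N (and num_joints inside its loss, where present).
  4. data.train / data.val / data.test: your own ann_file and img_prefix paths, with the appropriate dataset type (e.g. TopDownCocoDataset or BottomUpCocoDataset).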

xypu98 commented 2 years ago

> I have tried the bottom-up approaches but unfortunately I just get the following error: [...]
> Please have a look and let me know what I might not be changing appropriately for my dataset.

I have the same error! I noticed that you may have resolved this problem by modifying `_base_ = ['../../../../_base_/datasets/coco.py']`. I also modified that .py file for 6 keypoints (my dataset has 6 keypoints), but the error still occurs. Is there any problem?

my coco.py

dataset_info = dict(
    dataset_name='coco',
    paper_info=dict(
        author='Lin, Tsung-Yi and Maire, Michael and '
        'Belongie, Serge and Hays, James and '
        'Perona, Pietro and Ramanan, Deva and '
        r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
        title='Microsoft coco: Common objects in context',
        container='European conference on computer vision',
        year='2014',
        homepage='http://cocodataset.org/',
    ),
    keypoint_info={
        0:
        dict(
            name='left_top',
            id=1,
            color=[51, 153, 255],
            type='upper',
            swap='right_top'),
        1:
        dict(
            name='right_top',
            id=2,
            color=[51, 153, 255],
            type='upper',
            swap='left_top'),
        2:
        dict(
            name='right_bottom',
            id=3,
            color=[51, 153, 255],
            type='lower',
            swap='left_bottom'),
        3:
        dict(
            name='left_bottom',
            id=4,
            color=[51, 153, 255],
            type='lower',
            swap='right_bottom'),
        4:
        dict(name='center', id=5, color=[0, 255, 0], type='upper', swap=''),
        5:
        dict(name='head', id=6, color=[0, 255, 0], type='upper', swap='')
    },
    skeleton_info={
        0:
        dict(link=('left_top', 'right_top'), id=0, color=[0, 255, 0]),
        1:
        dict(link=('right_top', 'right_bottom'), id=1, color=[0, 255, 0]),
        2:
        dict(link=('right_bottom', 'left_bottom'), id=2, color=[0, 255, 0]),
        3:
        dict(link=('left_bottom', 'left_top'), id=3, color=[0, 255, 0]),
        4:
        dict(link=('center', 'head'), id=4, color=[51, 153, 255])
    },
    joint_weights=[1., 1., 1., 1., 1., 1.],
    sigmas=[0.026, 0.025, 0.025, 0.035, 0.035, 0.035])
alaa-shubbak commented 1 year ago

While reading the configs of different models, I noticed that there are four parameters:

  1. num_output_channels
  2. dataset_joints
  3. dataset_channel
  4. inference_channel

I noticed that in the COCO dataset config (human keypoints) those parameters are set to 17, while for animal_pose, for example, they are set to 20.

These values are set in the following part of the config:

channel_cfg = dict(
    num_output_channels=20,
    dataset_joints=20,
    dataset_channel=[
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
    ],
    inference_channel=[
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
    ])

I think it is necessary to make such changes according to the number of keypoints in my dataset.
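For example, for a hypothetical 6-keypoint dataset the block would become something like this (a sketch mirroring the pattern above):

channel_cfg = dict(
    num_output_channels=6,
    dataset_joints=6,
    dataset_channel=[
        [0, 1, 2, 3, 4, 5],
    ],
    inference_channel=[0, 1, 2, 3, 4, 5])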

My questions are: first, should I make all of these changes for all types of models, e.g. top-down model structures as well as bottom-up models? Second, if I format my dataset and annotations like the COCO dataset, but the objects I am annotating keypoints for are not humans, will all types of models work, or could this cause an error?