open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

Using bbox during testing #2005

Closed by LjIA26 1 year ago

LjIA26 commented 1 year ago

Hello,

During testing, it seems the algorithm is not taking the bounding boxes from the JSON file, because when I saved the output images the bounding boxes appeared all over the place. Is there a way to fix this?

How can I make the bounding box the only area where prediction happens? I am getting predictions all over the place, outside the bounding box. During training, the visualization of the transforms shows that training happens inside the bounding box and not outside of it.

Thank you for your time.

LjIA26 commented 1 year ago

I am sorry if I wasn't very clear with the question, I just edited it.

Tau-J commented 1 year ago

Hi, thanks for using MMPose. For evaluation, you can set bbox_file in val_dataloader and test_dataloader. For prediction, you should first run a detector to provide bboxes. We provide a demo script, demo/topdown_demo_with_mmdet.py, so that you can run the full detection-plus-pose pipeline.
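For reference, this is roughly how the COCO configs wire it up; a minimal sketch (the paths are the standard COCO ones, substitute your own):

val_dataloader = dict(
    dataset=dict(
        type='CocoDataset',
        ann_file='annotations/person_keypoints_val2017.json',
        # COCO-format detection results, used in place of GT boxes
        # during evaluation.
        bbox_file='data/coco/person_detection_results/'
        'COCO_val2017_detections_AP_H_56_person.json',
        test_mode=True))
test_dataloader = val_dataloader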

LjIA26 commented 1 year ago

Yes, it is for evaluation of my test set. Is tools/test.py still in development, then?

LjIA26 commented 1 year ago

This is what I ran on the command line:

python tools/test.py configs/plantsv5.py train-F24/epoch_50.pth --work-dir test-F24/ --show-dir test-F24/

And this is the relevant section of the config file:

test_dataloader = dict(
    batch_size=10,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        data_mode=data_mode,
        metainfo=dict(from_file='configs/_base_/datasets/plantsv5.py'),
        ann_file='annotations/convertedv2.json',
        bbox_file='data/buildings/person_detection_results/bbox.json',
        data_prefix=dict(img='images-1/'),
        test_mode=True,
        pipeline=test_pipeline
    ))
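
As a side note, bbox_file expects COCO-style detection results: a JSON list with one entry per instance, where bbox is [x, y, width, height]. A minimal sketch of writing such a file (all ids and values here are placeholders):

import json

# COCO-style detection results: one entry per detected instance.
# image_id must match the ids in the annotation file; bbox is
# [x, y, width, height] in pixels, not [x1, y1, x2, y2].
detections = [
    dict(image_id=1, category_id=1,
         bbox=[120.0, 80.0, 200.0, 340.0], score=0.98),
]

with open('bbox.json', 'w') as f:
    json.dump(detections, f)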
Tau-J commented 1 year ago

Sorry, I can't reproduce the error you encountered. Would you mind providing one of the visualized images saved by tools/test.py?

LjIA26 commented 1 year ago

[image attachment]

This is an example of the types of results I get.

The bounding box, which comes from the ground-truth file, doesn't cover the whole image, yet the prediction is applied to the whole image.

Also, I don't have access to my PC at the moment; it's very late over here. Thank you for your understanding.

LjIA26 commented 1 year ago

Having the prediction only inside the bounding box is very important in my case, as I have two classes, one of which requires predictions only inside the bbox.

Tau-J commented 1 year ago

In MMPose, all predicted keypoints will be inside bboxes.

LjIA26 commented 1 year ago

Yes, that's the problem: it's not working, and the boxes come out in odd shapes, very different from their ground truths. When I use the visualization tool in transformed mode, I can see that the bboxes are correct.

Tau-J commented 1 year ago

Sorry, I can't locate the error from the information provided, nor can I reproduce your problem on the COCO dataset. Maybe when you have access to your computer you can share a real image to help me understand.

Also, in the test phase MMPose directly uses the bboxes provided by bbox_file for prediction and does not augment the data, so you should not rely on the transformed mode to check whether the bboxes are correct.
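
A quick way to check the raw boxes without going through the pipeline is to draw them straight from the annotation file. A minimal sketch with OpenCV (the paths are placeholders for your own files):

import json
import os

import cv2

ann = json.load(open('data/plants/annotations/convertedv2.json'))  # placeholder
img_dir = 'data/plants/images-1/'  # placeholder
file_names = {im['id']: im['file_name'] for im in ann['images']}

for a in ann['annotations'][:20]:
    img = cv2.imread(os.path.join(img_dir, file_names[a['image_id']]))
    x, y, w, h = map(int, a['bbox'])  # COCO bboxes are [x, y, w, h]
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite(f"bbox_check_{a['id']}.jpg", img)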

LjIA26 commented 1 year ago

Sorry! Turns out those images with the weird bboxes had w and h inverted. Very sorry about that!

However, there still seems to be an offset pushing predictions outside the bounding box, which does matter in my case. Is there a way to correct that offset?

Again, sorry for the previous confusion.
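
For anyone hitting the same thing, here is a small sketch that flags annotations whose box overflows the image as written but would fit with width and height swapped, the typical symptom of inverted w/h (the annotation path is a placeholder):

import json

ann = json.load(open('annotations/convertedv2.json'))  # placeholder path
sizes = {im['id']: (im['width'], im['height']) for im in ann['images']}

for a in ann['annotations']:
    x, y, w, h = a['bbox']
    img_w, img_h = sizes[a['image_id']]
    # A box that overflows the image as-is but fits with w/h swapped
    # is a strong hint the two fields were inverted on export.
    overflows = x + w > img_w or y + h > img_h
    fits_swapped = x + h <= img_w and y + w <= img_h
    if overflows and fits_swapped:
        print(f"annotation {a['id']}: bbox {a['bbox']} looks w/h-swapped")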

Tau-J commented 1 year ago

If you are using a top-down heatmap head, there should be no offset; all predicted keypoints should be inside the bboxes. More specifically, the predicted coordinates come from an argmax over the heatmap, whose output must lie within the bbox region.
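
To make that concrete, here is a minimal sketch of the top-down decode step (a simplification, not MMPose's exact implementation): the argmax lives on the heatmap grid, and its coordinates are mapped back through the bbox center and scale, so a keypoint can only land inside the model's input region. One caveat: GetBBoxCenterScale pads the box (by 1.25x by default), so a prediction can legitimately fall slightly outside the tight bbox while staying inside the padded crop.

import numpy as np

def decode_topdown(heatmap, center, scale):
    """Map heatmap argmaxes back to image coordinates.

    heatmap: (K, H, W); center/scale: bbox center and the padded
    bbox size in pixels (as produced by GetBBoxCenterScale).
    """
    K, H, W = heatmap.shape
    flat = heatmap.reshape(K, -1).argmax(axis=1)
    xy = np.stack([flat % W, flat // W], axis=1).astype(float)  # (K, 2)
    # Rescale from the heatmap grid to the padded bbox, then shift into the image.
    return xy / np.array([W, H]) * scale + center - scale / 2

# The argmax at a heatmap corner still maps inside the 256x256 crop.
hm = np.zeros((1, 64, 64))
hm[0, 63, 63] = 1.0
print(decode_topdown(hm, center=np.array([128., 128.]), scale=np.array([256., 256.])))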

LjIA26 commented 1 year ago

Hello,

Yes, I am using top-down. Here is an example. As you can see, the predicted keypoints are outside of the bounding box. The bounding box is in the bottom half of the image.

[image: PIC225G79_0]

LjIA26 commented 1 year ago

This is the complete config:

model = dict(
    type='TopdownPoseEstimator',
    data_preprocessor=dict(
        type='PoseDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True),
    backbone=dict(
        type='HRNet',
        in_channels=3,
        extra=dict(
            stage1=dict(
                num_modules=1,
                num_branches=1,
                block='BOTTLENECK',
                num_blocks=(4, ),
                num_channels=(64, )),
            stage2=dict(
                num_modules=1,
                num_branches=2,
                block='BASIC',
                num_blocks=(4, 4),
                num_channels=(32, 64)),
            stage3=dict(
                num_modules=4,
                num_branches=3,
                block='BASIC',
                num_blocks=(4, 4, 4),
                num_channels=(32, 64, 128)),
            stage4=dict(
                num_modules=3,
                num_branches=4,
                block='BASIC',
                num_blocks=(4, 4, 4, 4),
                num_channels=(32, 64, 128, 256))),
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmpose/pretrain_models/hrnet_w48-8ef0771d.pth')),
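    # Note (worth double-checking): the channel widths above (32, 64, 128,
    # 256) describe HRNet-W32, while the pretrained checkpoint below is
    # hrnet_w48; weights with mismatched shapes are typically skipped when
    # loading.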
    head=dict(
        type='HeatmapHead',
        in_channels=32,
        out_channels=100,
        deconv_out_channels=None,
        loss=dict(type='KeypointMSELoss', use_target_weight=True),
        decoder=dict(
            type='MSRAHeatmap',
            input_size=(256, 256),
            heatmap_size=(64, 64),
            sigma=2)),
    test_cfg=dict(
        flip_test=False,
        shift_heatmap=False,
        use_gt_bbox=True,
        output_heatmaps=True))
dataset_type = 'CocoDataset'
data_mode = 'topdown'
data_root = 'data/plants/'
train_pipeline = [
    dict(type='LoadImage', file_client_args=dict(backend='disk')),
    dict(type='GetBBoxCenterScale', padding=2),
    dict(type='TopdownAffine', input_size=(256, 256)),
    dict(
        type='GenerateTarget',
        target_type='heatmap',
        encoder=dict(
            type='MSRAHeatmap',
            input_size=(256, 256),
            heatmap_size=(64, 64),
            sigma=2)),
    dict(type='PackPoseInputs')
]
val_pipeline = [
    dict(type='LoadImage', file_client_args=dict(backend='disk')),
    dict(type='GetBBoxCenterScale'),
    dict(type='TopdownAffine', input_size=(256, 256)),
    dict(type='PackPoseInputs')
]
test_pipeline = [
    dict(type='LoadImage', file_client_args=dict(backend='disk')),
    dict(type='GetBBoxCenterScale'),
    dict(type='TopdownAffine', input_size=(256, 256)),
    dict(type='PackPoseInputs')
]
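# Note: train_pipeline uses GetBBoxCenterScale(padding=2), while the val/test
# pipelines fall back to the transform's default padding (1.25 in MMPose 1.x).
# A train/test padding mismatch can shift predictions relative to the bbox.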
train_dataloader = dict(
    batch_size=24,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='CocoDataset',
        data_root='data/plants/',
        data_mode='topdown',
        metainfo=dict(from_file='configs/_base_/datasets/plants6.py'),
        ann_file='annotations/Fullsize-3.json',
        data_prefix=dict(img='images-1/'),
        pipeline=[
            dict(type='LoadImage', file_client_args=dict(backend='disk')),
            dict(type='GetBBoxCenterScale', padding=2),
            dict(type='TopdownAffine', input_size=(256, 256)),
            dict(
                type='GenerateTarget',
                target_type='heatmap',
                encoder=dict(
                    type='MSRAHeatmap',
                    input_size=(256, 256),
                    heatmap_size=(64, 64),
                    sigma=2)),
            dict(type='PackPoseInputs')
        ]))
val_dataloader = dict(
    batch_size=32,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False, round_up=False),
    dataset=dict(
        type='CocoDataset',
        data_root='data/plants/',
        data_mode='topdown',
        metainfo=dict(from_file='configs/_base_/datasets/plants6.py'),
        ann_file='annotations/Fullsize-3-val.json',
        data_prefix=dict(img='validation-data/'),
        test_mode=True,
        pipeline=[
            dict(type='LoadImage', file_client_args=dict(backend='disk')),
            dict(type='GetBBoxCenterScale'),
            dict(type='TopdownAffine', input_size=(256, 256)),
            dict(type='PackPoseInputs')
        ]))
test_dataloader = val_dataloader
val_evaluator = [
    dict(type='PCKAccuracy', thr=0.05),
    dict(
        type='CocoMetric',
        ann_file='data/plants/annotations/Fullsize-3-val.json')
]

LjIA26 commented 1 year ago

So there is no way to solve it? @Tau-J @ly015

Tau-J commented 1 year ago

Sorry for the late reply; I still can't reproduce your problem on the COCO dataset. If you have solved it, please share some clues.

LjIA26 commented 1 year ago

Hello, I have not solved it. Sorry, I haven't been able to work on my research for two months, but now I am back.