Hi, I think maybe you could visualize some of the top-down results, especially those with large errors. It may help to locate the problem.
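For the visualization, something along these lines should work with mmpose's 0.x Python API (the config/checkpoint/image paths are placeholders, and I am assuming the init_pose_model / inference_top_down_pose_model / vis_pose_result helpers from mmpose.apis):

from mmpose.apis import (inference_top_down_pose_model, init_pose_model,
                         vis_pose_result)

# Placeholder paths -- point these at your own config, checkpoint and image.
pose_model = init_pose_model(
    'configs/my_config/medium_hospital_dataset/res152_coco_256x192_dark.py',
    'checkpoints/res152_coco_256x192_dark-ab4840d5_20200812.pth')

# One dict per person box, in xywh format.
person_results = [{'bbox': [21, 0, 151, 256]}]
pose_results, _ = inference_top_down_pose_model(
    pose_model, 'example.jpg', person_results, format='xywh')

# Draw the predicted keypoints on the image for inspection.
vis_pose_result(pose_model, 'example.jpg', pose_results,
                out_file='vis/example_pred.jpg')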
What is the size of your dataset (number of images)?
If your dataset is relatively small (only a few hundred), some modifications are needed.
The number of warmup iterations should be smaller. https://github.com/open-mmlab/mmpose/blob/202983d24665a909ae1c45f4025d66794b9e32fd/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py#L18
Increase the total number of epochs (and increase the lr steps accordingly). https://github.com/open-mmlab/mmpose/blob/master/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py#L20-L21
It may also help to initialize the model from a COCO-pretrained checkpoint: replace 'None' with the model link. https://github.com/open-mmlab/mmpose/blob/202983d24665a909ae1c45f4025d66794b9e32fd/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192.py#L2
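A minimal sketch of those three edits on top of hrnet_w32_coco_256x192.py (the numbers are only illustrative and should be tuned to the dataset size; I believe the URL below is the COCO HRNet-W32 checkpoint from the mmpose model zoo, but please double-check it):

# Illustrative fine-tuning overrides for a small custom dataset.
# 1. Initialize from a COCO-pretrained checkpoint instead of None.
load_from = ('https://download.openmmlab.com/mmpose/top_down/hrnet/'
             'hrnet_w32_coco_256x192-c78dce93_20200708.pth')
# 2. Shorter warmup, since each epoch now covers far fewer iterations.
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=50,  # down from 500
    warmup_ratio=0.001,
    step=[340, 380])  # lr steps pushed out to match the longer schedule
# 3. More epochs overall.
total_epochs = 420  # up from 210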
@jin-s13 I used pretrained models (downloaded from the mmpose Read the Docs) without further training, on a VERY small test dataset, meaning 20 pictures at most.
I specify the checkpoint file path on the command line; does that make a difference?
./tools/dist_test.sh configs/my_config/medium_office_dataset/higher_hrnet48_coco_512x512.py checkpoints/a_tester/higher_hrnet48_coco_512x512_ae-60fedcbc_20200712.pth 1 --out benchmark_result/medium_office_dataset/higher_hrnet_w48_512x512.json
EDIT: I did it your way, using the command below, and obtained the same results.
python tools/test.py configs/my_config/medium_hospital_dataset/res152_coco_256x192_dark.py https://download.openmmlab.com/mmpose/top_down/resnet/res152_coco_256x192_dark-ab4840d5_20200812.pth --out benchmark_result/medium_hospital_dataset/resnet152_dark_256x192.json
Here is the config file I used:
log_level = 'INFO'
load_from = 'https://download.openmmlab.com/mmpose/top_down/resnet/res152_coco_256x192_dark-ab4840d5_20200812.pth'
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=10)
evaluation = dict(interval=10, metric='mAP', key_indicator='AP')

optimizer = dict(
    type='Adam',
    lr=5e-4,
)
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[170, 200])
total_epochs = 210
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])

channel_cfg = dict(
    num_output_channels=17,
    dataset_joints=17,
    dataset_channel=[
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
    ],
    inference_channel=[
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
    ])

# model settings
model = dict(
    type='TopDown',
    pretrained='torchvision://resnet152',
    backbone=dict(type='ResNet', depth=152),
    keypoint_head=dict(
        type='TopDownSimpleHead',
        in_channels=2048,
        out_channels=channel_cfg['num_output_channels'],
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(
        flip_test=True,
        post_process='unbiased',
        shift_heatmap=True,
        modulate_kernel=11))

data_cfg = dict(
    image_size=[192, 256],
    heatmap_size=[48, 64],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=False,
    det_bbox_thr=0.0,
    bbox_file='data/custom/medium_hospital_dataset/256x192/detection_result/'
    'medium_hospital_dataset_256x192.json',
)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(
        type='TopDownHalfBodyTransform',
        num_joints_half_body=8,
        prob_half_body=0.3),
    dict(
        type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2, unbiased_encoding=True),
    dict(
        type='Collect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=[
            'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
            'rotation', 'bbox_score', 'flip_pairs'
        ]),
]

val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='ToTensor'),
    dict(
        type='NormalizeTensor',
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
    dict(
        type='Collect',
        keys=['img'],
        meta_keys=[
            'image_file', 'center', 'scale', 'rotation', 'bbox_score',
            'flip_pairs'
        ]),
]

test_pipeline = val_pipeline

data_root = 'data/custom/medium_hospital_dataset/256x192'
data = dict(
    samples_per_gpu=32,
    workers_per_gpu=4,
    train=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotation/medium_hospital_dataset_256x192.json',
        img_prefix=f'{data_root}/data/',
        data_cfg=data_cfg,
        pipeline=train_pipeline),
    val=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotation/medium_hospital_dataset_256x192.json',
        img_prefix=f'{data_root}/data/',
        data_cfg=data_cfg,
        pipeline=val_pipeline),
    test=dict(
        type='TopDownCocoDataset',
        ann_file=f'{data_root}/annotation/medium_hospital_dataset_256x192.json',
        img_prefix=f'{data_root}/data/',
        data_cfg=data_cfg,
        pipeline=val_pipeline),
)
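One way to isolate the problem with this config: since the bbox file is hand-made from the ground-truth annotations anyway, the detection stage can be taken out of the equation by evaluating with ground-truth boxes. A sketch of that debugging variant (same keys as above; as far as I know, bbox_file is ignored when use_gt_bbox=True):

# Debugging variant of data_cfg: evaluate with ground-truth boxes so any
# remaining error must come from the pose model or the annotations.
data_cfg = dict(
    image_size=[192, 256],
    heatmap_size=[48, 64],
    num_output_channels=channel_cfg['num_output_channels'],
    num_joints=channel_cfg['dataset_joints'],
    dataset_channel=channel_cfg['dataset_channel'],
    inference_channel=channel_cfg['inference_channel'],
    soft_nms=False,
    nms_thr=1.0,
    oks_thr=0.9,
    vis_thr=0.2,
    use_gt_bbox=True,  # was False
    det_bbox_thr=0.0,
    bbox_file='',  # unused when use_gt_bbox=True
)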
EDIT: I applied a rotation of 1 degree, and then of 10 degrees, to my test data without changing the annotations, and the tests showed relatively better results (AP of .38 on the hospital dataset with DarkPose res152 256x192). It seems that the problem comes from my dataset.
@ly015 Most of the errors come from the many artifacts in the pictures: folded blankets or intubation devices are mistaken for joints or limbs. I have no clue how I could remove them.
The whole point of the project is to make it work with these constraints.
The error came from the fact that I was resizing the pictures myself, while your pipeline already does that preprocessing.
With the original images the errors were corrected, and I now get better accuracy.
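For anyone hitting the same thing: the top-down pipeline already crops each person from the full-resolution image and warps it to image_size via TopDownAffine, so the source images must not be pre-resized by hand. A rough sketch of what that step does (simplified, not the exact mmpose implementation; center and scale are in original-image coordinates, with scale in the 200-pixel units used by COCO-style configs):

import cv2
import numpy as np

def crop_person(img, center, scale, out_size=(192, 256)):
    # Simplified stand-in for TopDownAffine: warp the person box to the
    # fixed network input size. mmpose does this internally, which is why
    # feeding pre-resized images breaks the center/scale geometry.
    w, h = out_size
    box_w, box_h = scale[0] * 200.0, scale[1] * 200.0
    # Three matching points define the affine map: box center, top-center
    # and center-right of the box onto the same points of the output.
    src = np.float32([
        center,
        [center[0], center[1] - box_h / 2],
        [center[0] + box_w / 2, center[1]],
    ])
    dst = np.float32([[w / 2, h / 2], [w / 2, 0], [w, h / 2]])
    trans = cv2.getAffineTransform(src, dst)
    return cv2.warpAffine(img, trans, (w, h))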
Hi!
I'm opening this issue because every top-down method gets lower precision than bottom-up on my own dataset. While top-down methods should usually give better results, I get better results with bottom-up methods.
Here are the result of various tests I did with various datasets : https://docs.google.com/spreadsheets/d/1VwA9OIKHJP8EzJRWCUb1TzbbaCG-GnJHqsd37OPfjnQ/edit#gid=0
Every model listed uses its original config file, where I just changed the data and data_cfg dictionaries to match the data, annotation, and bounding-box file paths. I used the COCO Annotator tool to create my annotations: I take the JSON file that will be my annotation, then make a copy of annotation_json["annotation"] that will be my box detection result (plus a fake 1.0 box confidence score that I add because the code needs it). Here is an example annotation entry:
{"id": 387, "image_id": 211, "category_id": 1, "dataset_id": 15, "segmentation": [[20.9, 256, 20.9, -0.6, 171.8, -0.6, 171.8, 256]], "area": 38656, "bbox": [21, 0, 151, 256], "iscrowd": false, "isbbox": true, "creator": "greg_is_greg", "width": 192, "height": 256, "color": "#4085ec", "keypoints": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 134, 48, 2, 89, 34, 2, 152, 130, 2, 72, 123, 2, 153, 200, 2, 51, 186, 2, 122, 192, 2, 89, 188, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "metadata": {}, "milliseconds": 41217, "events": [{"_cls": "SessionEvent", "created_at": {"$date": 1623937046948}, "user": "greg_is_greg", "milliseconds": 22771, "tools_used": ["BBox", "Keypoints"]}, {"_cls": "SessionEvent", "created_at": {"$date": 1623938782823}, "user": "greg_is_greg", "milliseconds": 9223, "tools_used": ["BBox"]}], "num_keypoints": 8, "score": 1.0}
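A rough sketch of that annotation-to-detection copy (the paths are mine, and depending on the COCO Annotator export the top-level key may be 'annotation' or 'annotations'):

import json

# Hypothetical paths -- adjust to your dataset layout.
with open('annotation/medium_hospital_dataset_256x192.json') as f:
    coco = json.load(f)

# Copy every ground-truth box into a pseudo detection-result list,
# adding the constant confidence score the test code expects.
detections = []
for ann in coco['annotations']:
    detections.append({
        'image_id': ann['image_id'],
        'category_id': ann['category_id'],
        'bbox': ann['bbox'],
        'score': 1.0,  # fake detector confidence
    })

with open('detection_result/medium_hospital_dataset_256x192.json', 'w') as f:
    json.dump(detections, f)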
Here is what typical medium office dataset data looks like (with visible annotations): [image]
Here is an example from the medium hospital dataset: [image]
Regarding the challenging images with a lot of occlusion in the hospital dataset, I somewhat expected my results to be low, but the fact that they are lower with top-down detectors than with bottom-up ones on BOTH the hospital and office datasets makes me think I went wrong somewhere.
Do you have any idea where I might have made a mistake?
Thanks in advance for your help.