Tau-J / rtmlib

RTMPose series (RTMPose, DWPose, RTMO, RTMW) without mmcv, mmpose, mmdet etc.
Apache License 2.0

About bounding box (minimal tight bbx?) #30

Closed (zehongs closed this issue 2 months ago)

zehongs commented 2 months ago

Thanks for the great work! This library is incredibly clean compared to mmpose.

I have a question about the definition of the bounding box. It doesn’t always appear to be the minimal bounding box, as it is in YOLOv8. Is there a specific interpretation or convention regarding the size of the bounding box? I would appreciate any clarification from the author.

I've attached two images as examples.

Tau-J commented 2 months ago

Hi @zehongs, as a convention, we expand the minimal bbox (usually calculated from keypoints) by a factor of 0.25 to ensure the whole instance is inside the bbox.
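
A minimal sketch of that convention, assuming the expansion is applied symmetrically around the box center and that a factor of 0.25 means each side length grows by 25%. The helper names `tight_bbox_from_keypoints` and `expand_bbox` are hypothetical and are not part of the rtmlib API:

```python
def tight_bbox_from_keypoints(keypoints):
    """Return the minimal xyxy box that encloses all (x, y) keypoints."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))


def expand_bbox(bbox, factor=0.25):
    """Grow an xyxy box around its center (assumed convention).

    Width and height are each multiplied by (1 + factor).
    """
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w = (x2 - x1) * (1 + factor)
    h = (y2 - y1) * (1 + factor)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Made-up keypoints, for illustration only.
keypoints = [(100.0, 50.0), (140.0, 60.0), (120.0, 250.0)]
tight = tight_bbox_from_keypoints(keypoints)
expanded = expand_bbox(tight)
print(tight)     # (100.0, 50.0, 140.0, 250.0)
print(expanded)  # (95.0, 25.0, 145.0, 275.0)
```

The expanded box can extend past the image border; whether and how it is clipped depends on the pipeline and is not shown here.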

zehongs commented 2 months ago

Hi, thanks for the reply! Actually, this is a minor question: do you have any comments on the bounding boxes shown in my examples? They are the outputs (xyxy format) of the rtmdet (https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_m_8xb8-300e_humanart-c2c7a14a.zip). They clearly behave differently: the left one is minimal and tight, while the right one is somewhat expanded.

I'm not sure whether this comes from some algorithm in the ONNX model or is related to the bounding box annotations in the training data.

Tau-J commented 2 months ago

I'm sure there is no such processing in the ONNX model. The expansion only happens at the input stage of pose estimation. (For mmpose, the bbox annotations are expanded in the augmentation pipeline.) So, if the bbox you show comes directly from the detection ONNX model, I think it is just a model performance issue. (TBH, I think even the right one is good enough in most circumstances.)
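
To illustrate where the expansion sits, here is a rough sketch of a top-down pose preprocessing step, assuming the detector box is passed through unchanged and only padded when it is converted to a center and scale for cropping. The function names, the 1.25 padding value (matching the 0.25 factor mentioned above), the aspect-ratio adjustment, and the 192x256 input size are illustrative assumptions, not confirmed details of rtmlib or mmpose:

```python
def bbox_to_center_scale(bbox, padding=1.25):
    """Convert a detector xyxy box to (center, scale), padding the size.

    The detector output itself is not modified; only the crop region
    used by the pose model is enlarged (assumed behavior).
    """
    x1, y1, x2, y2 = bbox
    center = ((x1 + x2) / 2, (y1 + y2) / 2)
    scale = ((x2 - x1) * padding, (y2 - y1) * padding)
    return center, scale


def match_aspect_ratio(scale, input_w, input_h):
    """Enlarge one side so the crop matches the model input aspect ratio."""
    w, h = scale
    ratio = input_w / input_h
    if w > h * ratio:
        return (w, w / ratio)
    return (h * ratio, h)


# Hypothetical detector output in xyxy format.
det_box = (100.0, 50.0, 140.0, 250.0)
center, scale = bbox_to_center_scale(det_box)
crop_size = match_aspect_ratio(scale, input_w=192, input_h=256)
print(center)     # (120.0, 150.0)
print(scale)      # (50.0, 250.0)
print(crop_size)  # (187.5, 250.0)
```

Under these assumptions, any looseness already present in the detector box is carried into the crop, which is consistent with treating it as a detector accuracy matter rather than a preprocessing effect.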

zehongs commented 2 months ago

I see. Thank you!