YueLiao / CDN

Code for "Mining the Benefits of Two-stage and One-stage HOI Detection"
Apache License 2.0
89 stars, 15 forks

Two questions about the detail #16

Closed. Devotedyang closed this issue 2 years ago.

Devotedyang commented 2 years ago

Hello, thanks for your research! I have two questions:

  1. CDN outputs 100 predicted HOI triplets for every V-COCO image and then applies PNMS to these top-100 results. After checking your code, I found that you don't set a threshold to filter the final results. So the final number of predicted HOI triplets may be at least 40 or 50, but the ground truth for one image has no more than 10 HOI triplets. How do you deal with this?

  2. In your paper, you do some useful research on Human-object Pair Generation (using the HO-pair decoder to replace Faster-RCNN in the iCAN baseline). I also want to run the same experiment, but I'm confused about the details. In CDN, we have 100 predictions for every image, so we can apply the Hungarian algorithm to match the predictions and labels for a batch. But if we use the HO-pair decoder to replace Faster-RCNN in iCAN and then set a threshold to generate human-object pairs, one image may generate k1 HO pairs while another generates k2 HO pairs (k1 is not equal to k2). Under these circumstances, should we apply the Hungarian algorithm to match predictions and labels for each image separately?

Sorry for the long and complex questions.

YueLiao commented 2 years ago

Thanks for your interest in our work!

  1. Because HOI detection uses the mAP evaluation metric, just as object detection does, it is very common for the number of predictions to be much larger than the number of GTs. We sort all predictions by confidence and then check whether an existing GT matches each one. More details can be seen in the evaluation metric implementation. Additionally, you could tune the IoU threshold in PNMS to filter out more predictions, which may not cost much AP.
  2. Sorry, I did not clearly get your point. We first train HOPD and then extract the H-O pairs offline. Next, we replace the original iCAN H-O pairs with these extracted pairs. Different numbers of predictions across images are common.
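
To make the first point concrete, here is a minimal sketch of why a long tail of low-confidence predictions barely affects AP. It is not the repository's evaluation code: `average_precision` is a hypothetical helper, it uses plain (non-interpolated) AP, and it assumes each prediction has already been marked as a true or false positive by matching against the GTs.

```python
def average_precision(predictions, num_gt):
    """Non-interpolated AP for one HOI category (simplified illustration).

    predictions: list of (score, is_true_positive) tuples; the flag is assumed
                 to come from an earlier GT-matching step.
    num_gt:      number of ground-truth triplets for this category.
    """
    # Rank all predictions by confidence, highest first.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    true_positives = 0
    precision_at_hits = []
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision measured at the rank where a GT is recovered.
            precision_at_hits.append(true_positives / rank)
    return sum(precision_at_hits) / num_gt if num_gt else 0.0


# Two GT triplets, both found with high confidence.
clean = [(0.9, True), (0.8, True)]
# The same predictions plus 98 low-confidence false positives.
with_tail = clean + [(0.01, False)] * 98
# A single confident false positive ranked above the true positives.
with_confident_fp = [(0.95, False)] + clean

print(average_precision(clean, 2))
print(average_precision(with_tail, 2))
print(average_precision(with_confident_fp, 2))
```

False positives ranked below the last true positive leave AP unchanged in this sketch, while a false positive ranked above the true positives lowers it. That is why an unfiltered set of 100 predictions per image is usually harmless, as long as the extra predictions carry low scores.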

Devotedyang commented 2 years ago

Regarding the second question, I mean: after you replace the original iCAN HO pairs with the HO-PD pairs, is the whole model still trained using the Hungarian algorithm to match predictions and labels?

YueLiao commented 2 years ago

We also train iCAN with the original H-O pairs; we only swap in the H-O pairs provided by HO-PD during evaluation. More details can be found in #14.

Devotedyang commented 2 years ago

Thanks!!!

hutuo1213 commented 1 year ago

Hello author! Regarding the first question, I found that many of the top_k predictions in the article meet the conditions but are repeated predictions, which lowers the mAP. Is this your original reason for using PNMS?
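
For readers following the PNMS discussion, the duplicate-suppression idea can be sketched as greedy pairwise NMS over triplets. This is only an illustration under stated assumptions, not CDN's implementation: the names `box_iou` and `pairwise_nms` are hypothetical, the pair overlap is assumed to be the plain product of human-box IoU and object-box IoU (the real code may weight the two terms differently), and the 0.7 threshold is an arbitrary example value.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pairwise_nms(triplets, iou_threshold=0.7):
    """Greedily drop lower-scored triplets that duplicate a kept one.

    Each triplet is a dict with keys: score, verb, obj_class,
    human_box, object_box. Only triplets with the same
    (verb, obj_class) label can suppress each other.
    """
    kept = []
    for cand in sorted(triplets, key=lambda t: t["score"], reverse=True):
        is_duplicate = False
        for k in kept:
            if (k["verb"], k["obj_class"]) != (cand["verb"], cand["obj_class"]):
                continue
            # Assumed pair overlap: product of human IoU and object IoU.
            overlap = (box_iou(k["human_box"], cand["human_box"])
                       * box_iou(k["object_box"], cand["object_box"]))
            if overlap > iou_threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(cand)
    return kept


triplets = [
    {"score": 0.9, "verb": "hold", "obj_class": "cup",
     "human_box": (0, 0, 10, 10), "object_box": (20, 20, 30, 30)},
    # Near-identical boxes and same label: treated as a duplicate.
    {"score": 0.6, "verb": "hold", "obj_class": "cup",
     "human_box": (0, 0, 10, 10.5), "object_box": (20, 20, 30, 30)},
    # Same boxes but a different verb: kept.
    {"score": 0.5, "verb": "drink", "obj_class": "cup",
     "human_box": (0, 0, 10, 10), "object_box": (20, 20, 30, 30)},
]
print([t["score"] for t in pairwise_nms(triplets)])
```

Lowering `iou_threshold` suppresses more near-duplicates, which corresponds to the tuning knob mentioned earlier in the thread.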