Closed xxxzhi closed 4 years ago
Our reported 22.12 mAP for HAKE-HICO-DET (PaStaNet in the paper) is composed of multiple components: the result of the Instance-level PaStaNet is fused with the results from TIN and our Image-level PaStaNet, and then Non-Interactive Suppression (NIS) trained on HAKE is applied. The vanilla TIN with NIS trained on HAKE achieves 18.33 mAP, while enhancing it with our Image-level PaStaNet brings it to 21.60 mAP. If we test only the result of the model 'res50_faster_rcnn_iter_1190000.ckpt' with the other parameters randomly initialized (we use it as an approximation of 'test_pastanet/HOI_iter_10.ckpt' in this issue), we get a trivial 2.91 mAP with NIS trained on HAKE, while our Instance-level PaStaNet achieves 19.52 mAP under the same setting. Thus the result reported in this issue is mainly credited to TIN and our Image-level PaStaNet. Also note that combining different high-performance models and techniques usually does not yield a linear improvement: the higher the performance each individual model achieves, the harder it is to improve further by combination, as their overlaps can be larger. However, with more data and knowledge, our HAKE-Large (PaStaNet in the paper) manages to improve the performance considerably, showing the importance of our HAKE-Large data.
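For concreteness, the fuse-then-suppress pipeline described above can be sketched roughly like this (the function names, the equal-weight linear blend, and the 0.1 threshold are illustrative assumptions, not the actual HAKE-Action implementation):

```python
import numpy as np

def fuse_scores(scores_a, scores_b, weight=0.5):
    # Late fusion: a simple linear blend of per-category HOI scores
    # from two models (the equal weighting is an illustrative choice).
    return weight * scores_a + (1.0 - weight) * scores_b

def apply_nis(scores, interactiveness, threshold=0.1):
    # Non-Interactive Suppression sketch: zero out every HOI score of a
    # human-object pair whose interactiveness falls below the threshold.
    keep = (interactiveness >= threshold).astype(scores.dtype)
    return scores * keep[:, None]

# toy example: 3 candidate human-object pairs, 4 HOI categories
scores_a = np.array([[0.9, 0.1, 0.2, 0.0],
                     [0.2, 0.8, 0.1, 0.1],
                     [0.3, 0.3, 0.3, 0.3]])
scores_b = np.array([[0.8, 0.2, 0.1, 0.1],
                     [0.1, 0.9, 0.2, 0.0],
                     [0.2, 0.2, 0.2, 0.2]])
interactiveness = np.array([0.9, 0.8, 0.05])  # third pair non-interactive

fused = apply_nis(fuse_scores(scores_a, scores_b), interactiveness)
```

In this toy case the third pair is fully suppressed, while the first two keep their blended scores.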
Well,
!!! The result of 'test_pastanet/HOI_iter_1.ckpt' is:
Default: 0.22061835057332763
Default rare: 0.20459114486066238
Default non-rare: 0.2254056977342536
Known object: 0.24004048942292627
Known object, rare: 0.2247353598730576
Known object, non-rare: 0.24461215149626364
I think this is an approximation of the randomly initialized model in my experiment. Is it possible that I made some mistakes? Or has something you provided (e.g., the part knowledge) already been optimized?
which is trivial, while our Instance-level PaStaNet* will achieve 19.52 mAP under the same setting.
Here, do you mean the Instance-level PaStaNet* achieves 19.52 under the randomly initialized model?
Btw, compared to your result, TIN might be useless, because we directly achieve around 18.2 without Non-Interactive Suppression based on the TIN code. One of the very useful parts is re-weighting, and it is also very simple.
I run the code with python3. So I change the pickle format like this,
Trainval_GT = pickle.load(open(cfg.DATA_DIR + '/' + 'Trainval_GT_10w.pkl', "rb"), encoding='latin1')
Trainval_N = pickle.load(open(cfg.DATA_DIR + '/' + 'Trainval_Neg_10w.pkl', "rb"), encoding='latin1')
The other parts are unchanged. I can also upload the code to GitHub. You might find the problem if your randomly initialized model is much worse than 22.00.
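As background on the change above: Python 2 pickled `str` objects as raw bytes, and Python 3's `pickle` must be told how to decode them; `encoding='latin1'` maps every byte one-to-one, so decoding can never fail. A minimal, self-contained illustration (the byte string below is a hand-built Python 2-style pickle, not a file from this repo):

```python
import pickle

# A pickle as Python 2 would write it: protocol 2, SHORT_BINSTRING
# carrying the single non-ASCII byte 0xe9 ('é' in Latin-1).
py2_pickle = b'\x80\x02U\x01\xe9q\x00.'

# With the default encoding ('ascii'), loading this raises
# UnicodeDecodeError; 'latin1' decodes every byte successfully.
value = pickle.loads(py2_pickle, encoding='latin1')
print(value)  # 'é'
```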
Thanks.
The reported 22.06 mAP in the comment contains the effects of 'test_pastanet/HOI_iter_1.ckpt', TIN, NIS trained on HAKE, our optimized Image-level PaStaNet* result, and our re-weighting strategy. If 'test_pastanet/HOI_iter_1.ckpt' is removed, the result will be approximately 21.6 mAP. If 'test_pastanet/HOI_iter_1.ckpt' is replaced with our trained PaStaNet model, it will achieve 22.66 mAP. Also note that the improvement that fusion between models brings is usually not linear, and our Instance-level model suffers a lot from its overlap with our Image-level model, especially in fusion.
The 19.52 mAP is achieved by our trained PaStaNet model without fusion with TIN, NIS and Image-level PaStaNet. And the 2.91 mAP is achieved by 'res50_faster_rcnn_iter_1190000.ckpt' (all parameters that aren't in the checkpoint are randomly initialized) without fusion with TIN, NIS and Image-level PaStaNet*. Sorry for the ambiguity.
Noticing that current HOI detection models might produce irrational classification scores for rare categories, we performed a grid search on a validation set selected from HAKE to find our reweighting factors. They did help a lot; however, even with them, the vanilla 'res50_faster_rcnn_iter_1190000.ckpt' (without fusion with TIN, NIS, and Image-level PaStaNet*) still performs poorly, as shown in (2). Therefore, the well-trained TIN model is still important.
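A minimal sketch of such a search for per-class reweighting factors (greedy, one class at a time; the grid values are illustrative, and top-1 accuracy stands in for the mAP used in the actual experiments):

```python
import numpy as np

def top1_accuracy(scores, labels):
    # Fraction of samples whose highest score matches the label.
    return float(np.mean(scores.argmax(axis=1) == labels))

def coordinate_grid_search(scores, labels, grid=(0.5, 1.0, 2.0, 4.0)):
    # Greedily pick one multiplicative factor per class that maximizes
    # the validation metric, holding the other factors fixed.
    n_classes = scores.shape[1]
    factors = np.ones(n_classes)
    for c in range(n_classes):
        best_f, best_m = factors[c], -1.0
        for f in grid:
            trial = factors.copy()
            trial[c] = f
            m = top1_accuracy(scores * trial, labels)
            if m > best_m:
                best_f, best_m = f, m
        factors[c] = best_f
    return factors

# toy validation set: class 1 is systematically under-scored
val_scores = np.array([[0.6, 0.20],
                       [0.5, 0.30],
                       [0.4, 0.25]])
val_labels = np.array([0, 1, 1])
factors = coordinate_grid_search(val_scores, val_labels)
```

In this toy case the search recovers factors that fix the under-scored class, lifting validation accuracy above the unweighted baseline.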
Thanks for your reply. This information is important and helpful!
we perform grid search on selected validation set from HAKE to find our reweighting factors.
Oh, thanks. I have noticed that the HO_weight is different from TIN's.
The 19.52 mAP is achieved by our trained PaStaNet model without fusion with TIN, NIS and Image-level PaStaNet
So, you still include Image-level PaStaNet* in the final result? Would you mind providing the full result? I mean the Rare and Non-rare categories. The full is 19.52; the rare is ?
I'm still confused about the performance of 'test_pastanet/HOI_iter_1.ckpt'. In other words, is the training step (python tools/Train_pasta_HICO_DET.py --data 0 --init_weight 1 --train_module 2 --num_iteration 11 --model test_pastanet) unnecessary when we fuse the model with TIN, NIS, and Image-level PaStaNet*? That seems plausible, since you have used Image-level PaStaNet*. What is the result of the model without Image-level PaStaNet*? That would also be helpful. Your final result looks like an ensemble of multiple models.
Anyway, thanks very much for the information.
It achieves 17.29 mAP on the Rare set and 20.19 mAP on the Non-rare set. We are working on the journal version of PaStaNet, in which we improve the result of the model without Image-level PaStaNet; it will be made public soon.
Thanks for your released code. However, I face a strange problem: HAKE-Action seems not to need any training. I run the code using the command:
Then, I test the model 'test_pastanet/HOI_iter_10.ckpt'.
Unbelievably! The result is:
Have you ever tested your code like this? This result is ....... I'm trying to test snapshot 1. I did not change the code.