Why does disabling score fusion give even better results?

makecent commented 2 years ago

My understanding is that enabling score fusion uses the external classification scores, which should lead to better results. However, when I disable score fusion, i.e.:

# when using external scores, our model is generating "proposals"
# multiclass_nms: False,
# ext_score_file: ./data/thumos/annotations/thumos14_cls_scores.pkl,
# comment out L47-48 and uncomment L50 to disable score fusion
multiclass_nms: True,

The eval result is:

|tIoU = 0.30: mAP = 81.87 (%)
|tIoU = 0.40: mAP = 77.58 (%)
|tIoU = 0.50: mAP = 71.65 (%)
|tIoU = 0.60: mAP = 57.75 (%)
|tIoU = 0.70: mAP = 43.07 (%)
Avearge mAP: 66.38 (%)

When I enable score fusion instead, i.e.:

# when using external scores, our model is generating "proposals"
multiclass_nms: False,
ext_score_file: ./data/thumos/annotations/thumos14_cls_scores.pkl,
# comment out L47-48 and uncomment L50 to disable score fusion
# multiclass_nms: True,

I got a worse result:

|tIoU = 0.30: mAP = 74.82 (%)
|tIoU = 0.40: mAP = 71.34 (%)
|tIoU = 0.50: mAP = 65.70 (%)
|tIoU = 0.60: mAP = 55.28 (%)
|tIoU = 0.70: mAP = 41.97 (%)
Avearge mAP: 61.82 (%)
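As a quick sanity check on the two result blocks above, here is a minimal sketch that only redoes the arithmetic. The variable names are illustrative and nothing here comes from the repository; the mAP values are copied from the logs.

```python
# mAP values (%) copied from the two evaluation logs above.
tiou = [0.30, 0.40, 0.50, 0.60, 0.70]
no_fusion = [81.87, 77.58, 71.65, 57.75, 43.07]  # score fusion disabled
fusion = [74.82, 71.34, 65.70, 55.28, 41.97]     # score fusion enabled


def average(values):
    """Plain arithmetic mean over the tIoU thresholds."""
    return sum(values) / len(values)


avg_no_fusion = round(average(no_fusion), 2)
avg_fusion = round(average(fusion), 2)

# Per-threshold gap: (fusion disabled) minus (fusion enabled).
gaps = [round(a - b, 2) for a, b in zip(no_fusion, fusion)]

for t, g in zip(tiou, gaps):
    print(f"tIoU = {t:.2f}: gap = {g:.2f}")
print(f"average mAP: {avg_no_fusion:.2f} vs {avg_fusion:.2f}")
```

The averages match the reported 66.38 and 61.82, and the gap shrinks from 7.05 points at tIoU = 0.30 to 1.10 points at tIoU = 0.70.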

BTW, I did NOT retrain the model after revising the YAML, because I think this setting should only affect testing.

tzzcl commented 2 years ago

I guess the major performance drop may come from the duplicate actions, i.e., Diving and Cliff Diving as in #10. When we disable score fusion, we can provide predictions for both Diving and Cliff Diving. But when we enable score fusion, one of the two categories may be suppressed.

makecent commented 2 years ago

@tzzcl According to #10, if we disable score fusion, ActionFormer cannot predict Cliff_Diving at all, as it never saw a training sample from this category.

I still don't get why fusing with the external scores gives a worse result. And if that's true, why even bother with the external scores?

tzzcl commented 2 years ago

When I refer to #10, I mean that multiple labels can exist for the same action. ActionFormer could not handle this situation previously, but we fixed that issue in a commit. Thus, when we disable score fusion, ActionFormer can predict multiple action labels for the same interval. When we enable score fusion, we may not be able to do this.
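To make the difference concrete, here is a toy sketch of how a label could get suppressed. It assumes score fusion relabels class-agnostic proposals with only the top-k video-level classes from the external classifier and multiplies the two scores. The function names, the `topk` and `threshold` parameters, and all numbers are hypothetical and not taken from the repository.

```python
def fuse_with_video_scores(proposals, video_cls_scores, topk=1):
    """Relabel class-agnostic proposals with the top-k video-level classes.

    proposals: list of (start, end, proposal_score)
    video_cls_scores: dict mapping class name -> external video-level score
    """
    top = sorted(video_cls_scores.items(), key=lambda kv: kv[1], reverse=True)[:topk]
    fused = []
    for start, end, score in proposals:
        for label, cls_score in top:
            # Fused confidence: proposal score times external class score.
            fused.append((start, end, label, score * cls_score))
    return fused


def per_class_predictions(segments, threshold=0.1):
    """Keep every (segment, label) pair whose own score clears the threshold."""
    return [(s, e, label, sc) for s, e, label, sc in segments if sc >= threshold]


# Hypothetical video where the same interval is annotated with two labels.
proposals = [(12.0, 15.5, 0.9)]
video_cls_scores = {"CliffDiving": 0.6, "Diving": 0.3, "Shotput": 0.1}
fused = fuse_with_video_scores(proposals, video_cls_scores, topk=1)

# Without fusion, the detector scores each class itself (made-up scores).
segments = [(12.0, 15.5, "CliffDiving", 0.8), (12.0, 15.5, "Diving", 0.7)]
kept = per_class_predictions(segments)

fused_labels = {label for _, _, label, _ in fused}
kept_labels = {label for _, _, label, _ in kept}
print("with fusion:   ", sorted(fused_labels))
print("without fusion:", sorted(kept_labels))
```

In this toy setup the fused output carries only one of the two labels for the interval, while the per-class output keeps both. Whether the real pipeline behaves this way depends on how many video-level classes it keeps.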

makecent commented 2 years ago

I see. It's great to get rid of the external scores.

Thanks.

happyharrycn / actionformer_release
