Hi there, I found a bug that may cause a big mistake:
In `pysgg/modeling/roi_heads/relation_head/rel_proposal_network/loss.py`, the `loss_eval_hybrid_level` function mistakenly regards the last logit as the background; however, the following code in PySGG regards the first logit as the background.
Original code:

```python
mulitlabel_logits = selected_cls_logits[:, :-1]
bin_logits = selected_cls_logits[:, -1]
```

I changed this to:

```python
mulitlabel_logits = selected_cls_logits[:, 1:]
bin_logits = selected_cls_logits[:, 0]
```
This is the mean_recall@100 result: the highest point is 30.5, and the result is very unstable.
This is the result after the change: the highest point is 31. Of course, it is still not as good as the authors report in the paper. (I wonder why we cannot reach the reported result?)
After fixing this bug, the evaluation result is notably better than before.
First, it is not a mistake; our relationship confidence estimation module makes both a multi-class and a binary relationship confidence prediction.
We concatenate the logits of the two-level prediction in the [RelAwareRelFeature module](https://github.com/SHTUPLUS/PySGG/blob/main/pysgg/modeling/roi_heads/relation_head/rel_proposal_network/models.py#L735).
As you can see, the multi-label logits are the front part of the tensor and the binary prediction is the last dimension, which matches the slicing in the loss calculation.
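To illustrate why the original slicing matches the concatenation order, here is a minimal NumPy sketch (shapes and names are hypothetical, not the actual PySGG tensors):

```python
import numpy as np

num_pairs, num_rel_classes = 4, 51  # hypothetical sizes

# Hybrid-level logits as assembled by the module: multi-class logits
# first, then the single binary (relatedness) logit as the last column.
multilabel_part = np.random.randn(num_pairs, num_rel_classes)
binary_part = np.random.randn(num_pairs, 1)
selected_cls_logits = np.concatenate([multilabel_part, binary_part], axis=1)

# The slicing in loss_eval_hybrid_level therefore recovers each part:
mulitlabel_logits = selected_cls_logits[:, :-1]  # all but the last column
bin_logits = selected_cls_logits[:, -1]          # the last column only

assert np.allclose(mulitlabel_logits, multilabel_part)
assert np.allclose(bin_logits, binary_part[:, 0])
```

Slicing `[:, 1:]` / `[:, 0]` instead would be correct only if the binary logit were concatenated first, which is not how the module builds the tensor.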
It is quite interesting that this modification doesn't lead to a large performance drop (optimal performance at 10k iterations according to your TensorBoard plot). I think there are fewer negative connections in predcls since only GT entities are used for pairing.
Second, since there are only a few samples for a large number of rare classes, this may lead to more variance in mR@100.
This is a normal phenomenon in long-tail recognition. We cannot guarantee exactly the same performance as reported in the paper.
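To see why mean recall is noisy under a long tail, consider a toy calculation (hypothetical numbers, not from the actual benchmark): mean recall averages per-class recalls, so a rare predicate with only a handful of test instances contributes recall values that move in large discrete steps.

```python
def mean_recall(hits, totals):
    """Average of per-class recalls (hits[i] / totals[i])."""
    return sum(h / t for h, t in zip(hits, totals)) / len(totals)

# One head class with 1000 instances and two rare classes with 5 each
# (made-up counts for illustration).
totals = [1000, 5, 5]
run_a = [800, 1, 1]  # per-class hits in one run
run_b = [800, 2, 1]  # identical except one extra hit on a rare class

# A single extra correct prediction on a 5-instance class shifts that
# class's recall by 0.2, and the overall mean recall by ~0.067.
print(mean_recall(run_a, totals))
print(mean_recall(run_b, totals))
```

With dozens of rare classes, this step-like behavior compounds, which is consistent with the unstable mR@100 curves observed above.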
Besides, the performance plotted during training is on the validation set, not the test set, and I don't know which PyTorch version, batch size, or other environmental parameters you used.
You can refer to this issue for more details on predcls. I provide the config and trained parameters of our model for predcls there, which achieves performance close to that reported in the paper.