Open XuMengyaAmy opened 3 years ago
We also noticed this problem before. It looks like precision rather than recall. However, this implementation is directly adopted from the previous popular methods (Iterative Message Passing and Neural Motifs), so we have to use the same calculation for fair comparisons.
Thanks for your reply. However, from my understanding, it does not look like precision or recall, because precision and recall need to be calculated separately for each class. It is just a simple accuracy calculation. Could you provide mAP, AUC, and true recall calculation code and results, using the `sklearn.metrics.average_precision_score` and `sklearn.metrics.precision_recall_curve` functions with proper `y_true` and `y_score`?
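For reference, a minimal sketch of what such a per-class evaluation could look like with sklearn. The data here is toy data, not output from the benchmark, and the variable names (`y_true`, `y_score`, `num_classes`) are my own illustration:

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Toy setup: 100 samples, 3 predicate classes (purely illustrative).
num_classes = 3
rng = np.random.default_rng(0)

# y_true: one-hot ground-truth labels; y_score: predicted class scores.
y_true = np.eye(num_classes)[rng.integers(0, num_classes, size=100)]
y_score = rng.random((100, num_classes))

# Average precision is computed per class, then averaged to get mAP.
per_class_ap = [
    average_precision_score(y_true[:, c], y_score[:, c])
    for c in range(num_classes)
]
mAP = float(np.mean(per_class_ap))
print(per_class_ap, mAP)
```

The key point is that each class gets its own AP before averaging, unlike a single ratio computed over all samples at once.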
Sorry, I have not been working on this project for a long time. You can add the corresponding code at https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/master/maskrcnn_benchmark/data/datasets/evaluation/vg/vg_eval.py
❓ Questions and Help
Hi, thanks for your work. You highlight the importance of the metrics used. However, I am confused about how you calculate SGRecall in Scene-Graph Benchmark/maskrcnn_benchmark/data/datasets/evaluation/vg/sgg_eval.py: `rec_i = float(len(match)) / float(gt_rels.shape[0])`
When focusing on relationship prediction (-precls-exmp), you did not calculate the recall for each specific relationship class. This recall looks more like an accuracy calculation to me. For precision and recall, each class needs to be calculated separately (Recall = TP / (TP + FN)). Did I miss anything? Thanks for your help!
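To make the distinction concrete, here is a toy sketch (not the benchmark's code) contrasting the single ratio computed by `rec_i` with a per-class recall that is averaged over classes. The arrays `gt_labels` and `matched` are hypothetical:

```python
import numpy as np

# Toy data: 5 ground-truth triplets, 4 of class 0 and 1 of class 1.
gt_labels = np.array([0, 0, 0, 0, 1])                 # GT predicate classes
matched   = np.array([1, 1, 1, 1, 0], dtype=bool)     # which GT triplets were hit

# Single ratio over all GT triplets, as in
# rec_i = float(len(match)) / float(gt_rels.shape[0]):
overall = matched.sum() / len(gt_labels)              # 4/5 = 0.8

# Per-class recall TP / (TP + FN), then macro-averaged:
classes = np.unique(gt_labels)
per_class = [matched[gt_labels == c].mean() for c in classes]
macro = float(np.mean(per_class))                     # (1.0 + 0.0) / 2 = 0.5

print(overall, per_class, macro)
```

The single ratio is dominated by the frequent class (0.8), while the macro-averaged per-class recall (0.5) exposes that the rare class was missed entirely, which is the concern raised above.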
Mengya Xu