KaihuaTang / Scene-Graph-Benchmark.pytorch

A new codebase for popular Scene Graph Generation methods (2020). Visualization & scene graph extraction on custom images/datasets are provided. It's also a PyTorch implementation of the paper "Unbiased Scene Graph Generation from Biased Training" (CVPR 2020).

About filtering duplicate relations for VG dataset #129

coldmanck opened this issue 3 years ago

coldmanck commented 3 years ago

It seems to me the following code snippet doesn't work as expected:

https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/d0ffa40d92133d7d865e531146de82c8c8a344c0/maskrcnn_benchmark/data/datasets/visual_genome.py#L148-L156
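For readers without the permalink handy, the snippet at L148-156 reads roughly as follows (reconstructed from the proposed fix below, so treat it as a paraphrase rather than an exact quote). It keeps a single predicate per pair, sampled with np.random.choice, so predicates annotated more times are more likely to survive:

if self.filter_duplicate_rels:
    # Filter out dupes!
    assert self.split == 'train'
    old_size = relation.shape[0]
    all_rel_sets = defaultdict(list)
    for (o0, o1, r) in relation:
        all_rel_sets[(o0, o1)].append(r)
    relation = [(k[0], k[1], np.random.choice(v)) for k, v in all_rel_sets.items()]
    relation = np.array(relation, dtype=np.int32)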

I assumed that "filtering out duplicate relations" meant removing exactly repeated relation triplets (i.e., triplets where not only the subject and object but also the predicate coincide); however, this snippet preserves only a single predicate for each object pair, with predicates that occur more often having a higher chance of being chosen. This seems unreasonable to me, and it makes the following snippet redundant:

https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/d0ffa40d92133d7d865e531146de82c8c8a344c0/maskrcnn_benchmark/data/datasets/visual_genome.py#L162-L164
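As discussed later in this thread, the logic at L162-164 handles the case where the same object pair carries several predicates when the relation map is built. A rough sketch of that logic, not an exact quote of the repo:

# When a (subject, object) slot in the relation map is already filled,
# replace its predicate with probability 0.5; otherwise just write it.
if relation_map[int(relation[i, 0]), int(relation[i, 1])] > 0:
    if random.random() > 0.5:
        relation_map[int(relation[i, 0]), int(relation[i, 1])] = int(relation[i, 2])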

To accommodate multiple labels for each object pair, I think we have to change L148-L156 to the following:

if self.filter_duplicate_rels:
    # Filter out exact duplicates only: triplets with the same subject,
    # object, AND predicate. (defaultdict and np are already imported
    # at the top of visual_genome.py.)
    assert self.split == 'train'
    old_size = relation.shape[0]
    all_rel_sets = defaultdict(set)
    for (o0, o1, r) in relation:
        all_rel_sets[(o0, o1)].add(r)
    # Emit one triplet per distinct predicate of each (subject, object) pair.
    relation = [(k[0], k[1], v) for k, vs in all_rel_sets.items() for v in vs]
    relation = np.array(relation, dtype=np.int32)
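A standalone toy comparison of the two behaviors, on hypothetical annotations:

from collections import defaultdict
import numpy as np

# Pair (0, 1) is annotated with predicate 2 twice and predicate 5 once;
# pair (0, 2) has a single annotation with predicate 7.
relation = np.array([(0, 1, 2), (0, 1, 2), (0, 1, 5), (0, 2, 7)], dtype=np.int32)

# Current behavior: one predicate per pair, sampled in proportion to its count.
pick_one = defaultdict(list)
for (o0, o1, r) in relation:
    pick_one[(o0, o1)].append(r)
print([(k[0], k[1], np.random.choice(v)) for k, v in pick_one.items()])
# e.g. [(0, 1, 2), (0, 2, 7)] -- predicate 5 is dropped about 2/3 of the time

# Proposed behavior: drop exact duplicates only, keep every distinct predicate.
dedup = defaultdict(set)
for (o0, o1, r) in relation:
    dedup[(o0, o1)].add(r)
print([(k[0], k[1], v) for k, vs in dedup.items() for v in vs])
# [(0, 1, 2), (0, 1, 5), (0, 2, 7)]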
coldmanck commented 3 years ago

Hi @KaihuaTang, could you please take some time to look at this? I think it's a fairly obvious bug. I've also provided a fix patch and opened a PR, to help others avoid the same mistake later.

KaihuaTang commented 3 years ago

@coldmanck Thank you for your advice. I also doubted this code when I saw it in the previous implementations. On second thought, however, I realized that the current SGG setting doesn't allow a pair of objects to take more than one predicate: if the same pair appears with two different ground-truth predicates, the single-label cross-entropy loss will penalize its prediction against at least one of them, no matter which predicate the model outputs.

The current version is not an ideal implementation, but your alternative code doesn't fit the multi-class setting either, unless you change SGG into a multi-label classification task and replace all the CE losses accordingly.
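A minimal sketch of that conflict, with toy logits and an assumed 4 predicate classes: under single-label cross-entropy, two annotations of the same pair with different predicates pull the same logits toward two different one-hot targets, so the summed loss can never reach zero (at best the model splits probability 0.5/0.5 between the two labels):

import torch
import torch.nn.functional as F

logits = torch.zeros(1, 4, requires_grad=True)  # one pair, 4 toy predicate classes
# The same pair annotated with predicate 1 in one triplet and predicate 3 in another.
loss = F.cross_entropy(logits, torch.tensor([1])) + F.cross_entropy(logits, torch.tensor([3]))
loss.backward()
print(logits.grad)  # gradients push probability mass toward both class 1 and class 3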

coldmanck commented 3 years ago

Hi @KaihuaTang, thank you for your reply. Actually, the current behavior (randomly sampling one predicate when a pair has multiple predicates; L162-164) is just fine. A NeurIPS 2019 paper [1] shows that, to optimize Recall@K under multilabel classification, the reduction from multilabel to binary/multi-class classification should be one of (a) one-versus-all normalized, (b) pick-all-labels normalized, or (c) pick-one-label. Among these, (c) is exactly what this SGG implementation does by default. We don't necessarily need a multilabel loss such as (a); a multiclass cross-entropy such as (b) or (c) also works! :)
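For concreteness, a rough sketch of reductions (b) and (c) from [1] for a single object pair, using toy logits and labels (the paper's exact notation and normalization differ):

import random
import torch
import torch.nn.functional as F

z = torch.tensor([[1.0, 0.5, -0.3, 2.0]])  # 1 x num_predicates, toy values
Y = [0, 3]                                  # multiple ground-truth predicates

# (b) Pick-all-labels normalized: average the softmax CE over all positive labels.
pal_n = sum(F.cross_entropy(z, torch.tensor([c])) for c in Y) / len(Y)

# (c) Pick-one-label: softmax CE on one randomly drawn positive label -- equal to
# (b) in expectation, and effectively what this repo's random sampling of a
# predicate per pair implements.
pol = F.cross_entropy(z, torch.tensor([random.choice(Y)]))
print(pal_n.item(), pol.item())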

By the way, my code fixes the bug at L148-156 while keeping the default pick-one-label behavior, thanks to the random sampling at Lines 162-164. My PR paves the way for anyone who later wants to extend the codebase to a multi-label setting.

[1] Menon, Aditya K., et al. "Multilabel reductions: what is my loss optimising?." Advances in Neural Information Processing Systems 32 (2019): 10600-10611.