[87] Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation #96

Open long8v opened 1 year ago

TL;DR

I read this because.. : it gave a sizable performance gain in #75. A recent AAAI paper also used this paper as its baseline.
task : two-stage SGG
problem : SGG data is long-tailed.
idea : proposes a confidence-aware bipartite graph neural network, plus a bi-level data resampling strategy.
architecture : a combination of relationship confidence estimation (RCE) and confidence-aware message propagation (CMP)
objective : CE loss for predicates and entities, plus a loss for relation confidence estimation (class-specific / overall)
baseline : graph-RCNN, GPS-Net, Motif, ...
data : Visual Genome, Open Images V4/6
evaluation : PredCls, SGCls, SGGen(head, body, tail), OI evaluation
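result : SOTA. Tail-class scores improve a lot.
contribution : the confidence-aware part? I don't know the GNN-for-SGG literature well, so I'm not sure what the contribution is.
limitation / things I cannot understand : What role does the confidence play? A loss seems to be applied directly to the confidence, but how? Is it supervised the same way graph-RCNN supervises "relatedness"?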

Details

Architecture

Proposal generation network

Objects are extracted with Faster R-CNN, and the entity representation $e_i$ is built from each object's visual feature $v_i$, geometric feature $g_i$, and class word embedding feature $w_i$.

The relation representation $r_{i \to j}$ is built by combining the entity representations $e_i$ and $e_j$. $u_{i,j}$ is the convolutional feature of the union region of the two entities.

Bipartite Graph Neural Network

1) Relationship Confidence Estimation Module: the confidence is computed from the class probabilities of the entities $e_i$ and $e_j$.

(???) I don't understand this part. In what sense is it global?

2) Confidence-aware message

entity-to-predicate
predicate-to-entity

$\alpha$ and $\beta$ are threshold parameters.

Each entity node $e_i$ is updated by aggregating its neighbors' messages.
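To make the role of the confidence and of $\alpha$, $\beta$ more concrete, here is a minimal sketch of a confidence-gated aggregation at one entity node. Everything specific in it is an assumption, not something confirmed by these notes: the piecewise-linear shape of `hard_gate`, the averaging over neighbors, and the names `hard_gate` and `aggregate_to_entity` are all hypothetical. It only illustrates the idea that messages from low-confidence predicates are suppressed before they reach the entity.

```python
def hard_gate(score, alpha, beta):
    """Assumed gate shape: 0 below beta, then a ramp with slope alpha, clipped at 1."""
    return min(1.0, max(0.0, alpha * (score - beta)))


def aggregate_to_entity(messages, confidences, alpha=2.0, beta=0.3):
    """Average the predicate messages arriving at one entity node,
    each scaled by the gated confidence of the predicate that sent it.

    messages:    list of equal-length float vectors (one per neighbor predicate)
    confidences: list of floats in [0, 1], one per message
    """
    if not messages:
        return []
    dim = len(messages[0])
    total = [0.0] * dim
    for message, score in zip(messages, confidences):
        gate = hard_gate(score, alpha, beta)
        for k in range(dim):
            total[k] += gate * message[k]
    # Normalize by the number of neighbors (an assumption of this sketch).
    return [value / len(messages) for value in total]


# Three neighbor predicates with low, medium, and high confidence.
messages = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
confidences = [0.2, 0.5, 0.9]
aggregated = aggregate_to_entity(messages, confidences)
```

With these hypothetical settings the first message (confidence 0.2, below $\beta$) is dropped entirely, the second is down-weighted, and the third passes through at full strength.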

Scene Graph Prediction

Bi-level Resampling

1) image-level over-sampling: a repeat factor is computed per category, and images containing rarely occurring classes seem to be sampled more often. $r^c = \max(1, \sqrt{t / f^c})$

$c$ : category
$f^c$ : frequency of category $c$ over the entire dataset
$t$ : hyperparameter (a category with $f^c \ge t$ gets $r^c = 1$, i.e. no over-sampling)
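The repeat factor can be sketched in a few lines. Two things here are assumptions on my part rather than taken from the notes: $f^c$ is computed as the fraction of images that contain category $c$, and the per-image factor is the maximum $r^c$ over the categories in that image (the usual repeat-factor-sampling convention). The function names are hypothetical.

```python
import math
from collections import Counter


def category_repeat_factors(image_categories, t):
    """r^c = max(1, sqrt(t / f^c)), with f^c assumed to be the
    fraction of images that contain category c."""
    num_images = len(image_categories)
    counts = Counter(c for cats in image_categories for c in set(cats))
    return {
        c: max(1.0, math.sqrt(t / (count / num_images)))
        for c, count in counts.items()
    }


def image_repeat_factors(image_categories, t):
    """Assumed image-level factor: the largest r^c among the image's categories."""
    r_c = category_repeat_factors(image_categories, t)
    return [max((r_c[c] for c in cats), default=1.0) for cats in image_categories]


# Toy dataset: "on" is in every image, "parked on" in only one of four.
images = [["on"], ["on"], ["on"], ["on", "parked on"]]
factors = category_repeat_factors(images, t=0.5)
per_image = image_repeat_factors(images, t=0.5)
```

In this toy dataset "on" has $f^c = 1 \ge t$, so its factor stays at 1, while "parked on" has $f^c = 0.25$ and gets $\sqrt{0.5 / 0.25} = \sqrt{2}$, which lifts only the last image.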

2) instance-level under-sampling: within each image, instances seem to be dropped depending on their predicate class. -> Iterative SGG is one-stage, so how did it do this? Did it just delete them from the GT labels?
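One possible reading of the instance-level step, matching the "just delete them from the GT labels" guess above, is sketched here. It assumes a per-predicate drop probability is already given. How the paper derives that probability is not covered in these notes, so `drop_prob` and the function name `undersample_instances` are hypothetical.

```python
import random


def undersample_instances(triplets, drop_prob, rng):
    """Randomly remove ground-truth triplets from one image.

    triplets:  list of (subject, predicate, object) tuples
    drop_prob: dict mapping predicate -> probability of dropping an instance
               (hypothetical input; predicates not listed are never dropped)
    rng:       random.Random instance, passed in for reproducibility
    """
    kept = []
    for subj, pred, obj in triplets:
        # rng.random() is in [0, 1), so prob 1.0 always drops and 0.0 never drops.
        if rng.random() >= drop_prob.get(pred, 0.0):
            kept.append((subj, pred, obj))
    return kept


gt = [("car", "on", "street"), ("car", "parked on", "street"), ("man", "on", "bike")]
kept = undersample_instances(gt, {"on": 1.0, "parked on": 0.0}, random.Random(0))
```

Here the head predicate "on" is always dropped and the tail predicate "parked on" is always kept, which is the extreme case of thinning out frequent predicates inside an image that was over-sampled for its rare ones.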

Result