long8v / PTIR

Paper Today I Read

[45] BGT-Net: Bidirectional GRU Transformer Network for Scene Graph Generation #51

Open long8v opened 1 year ago

long8v commented 1 year ago

image

paper

TL;DR

Details

Architecture

image

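Object proposals are fed in and the relation between each pair is predicted; the prediction method is described in the BA part of Section 3.3. $d = W_p * u_{i,j}$

$p_{i,j} = \mathrm{softmax}(W_r(o_i' * o_j' * u_{i,j}) + d \odot \tilde p_{i \to j})$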

Finally, the relation is obtained by taking the argmax.

image
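The prediction step can be sketched in a few lines of dependency-free Python. This is only an illustration under assumptions that the note does not state: `*` is read as an element-wise product, `W_r` and `W_p` both map a D-dimensional feature to R relation logits, and `p_tilde` is a length-R vector. The function names (`softmax`, `matvec`, `predict_relation`) and the toy values are hypothetical and are not taken from the paper's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested list by a length-cols list."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def predict_relation(o_i, o_j, u_ij, W_r, W_p, p_tilde):
    """Return (probabilities, argmax index) for one subject-object pair.

    Assumption: `*` is element-wise, and W_r, W_p have shape R x D.
    """
    # o_i' * o_j' * u_{i,j}, read as an element-wise product
    fused = [a * b * c for a, b, c in zip(o_i, o_j, u_ij)]
    # d = W_p * u_{i,j}
    d = matvec(W_p, u_ij)
    # W_r(...) + d (element-wise) p_tilde
    logits = [r + di * pt for r, di, pt in zip(matvec(W_r, fused), d, p_tilde)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return probs, best


# Toy example with D = 2 features and R = 3 relation classes (made-up numbers).
o_i, o_j, u_ij = [1.0, 2.0], [1.0, 1.0], [1.0, 0.0]
W_r = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
W_p = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
p_tilde = [math.log(0.1), math.log(0.1), math.log(0.8)]
probs, relation = predict_relation(o_i, o_j, u_ij, W_r, W_p, p_tilde)
```

With `W_r` set to zero and `d` equal to one everywhere, the logits reduce to `p_tilde`, so the toy example just shows how the bias term alone can decide the argmax.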

Frequency Softening

Because the VG dataset is long-tailed, a log is applied to the probabilities at the final softmax stage.

image
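A minimal sketch of what taking the log does to a long-tailed distribution. The helper `log_soften`, its `eps` guard, and the example probabilities are all hypothetical; the note only says that a log is applied, not exactly where or how it is implemented.

```python
import math


def log_soften(probs, eps=1e-12):
    """Element-wise log of a probability vector.

    eps is a hypothetical guard so that a zero probability does not hit log(0).
    """
    return [math.log(max(p, eps)) for p in probs]


# Made-up long-tailed predicate distribution: one head class, two tail classes.
head_heavy = [0.90, 0.09, 0.01]
softened = log_soften(head_heavy)

# In probability space the head is 90x the rarest class;
# in log space that ratio becomes an additive gap of log(90).
gap = softened[0] - softened[2]
```

The ordering of the classes is unchanged; only the scale of the differences between head and tail is compressed.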

Results

image