YangLing0818 / SGDiff

Official implementation for "Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training" https://arxiv.org/abs/2211.11138

scene graph encoding choice #11

Open kumarmanas opened 6 months ago

kumarmanas commented 6 months ago

From the paper, it seems that the scene graph is given as text triplets, which you encode with a graph encoder. Is that assumption correct, or are image features also used for scene graph encoding? If the assumption is correct, what kind of graph model do you use to encode the textual scene graph information? From the code, it looks like BERT is used to process the text in the scene graph.
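To make the question concrete, here is a minimal sketch of what I mean by a scene graph in text triplet form. It is only an illustration and is not taken from the SGDiff code: the example triplets and the helper names `build_vocab` and `encode_scene_graph` are hypothetical. Each (subject, predicate, object) triplet is mapped to integer ids, which is the kind of input a graph encoder could embed and pass messages over.

```python
# Hypothetical illustration only; not code from the SGDiff repository.
# A scene graph written as (subject, predicate, object) text triplets.
triplets = [
    ("sheep", "standing on", "grass"),
    ("sky", "above", "grass"),
]


def build_vocab(items):
    """Map each distinct string to an integer id, in first-seen order."""
    vocab = {}
    for item in items:
        if item not in vocab:
            vocab[item] = len(vocab)
    return vocab


def encode_scene_graph(triplets):
    """Turn text triplets into (subject_id, predicate_id, object_id) edges."""
    # Objects are the graph nodes; subjects and objects share one vocabulary.
    objects = [name for s, _, o in triplets for name in (s, o)]
    obj_vocab = build_vocab(objects)
    # Predicates label the directed edges between nodes.
    pred_vocab = build_vocab(p for _, p, _ in triplets)
    edges = [(obj_vocab[s], pred_vocab[p], obj_vocab[o]) for s, p, o in triplets]
    return obj_vocab, pred_vocab, edges


obj_vocab, pred_vocab, edges = encode_scene_graph(triplets)
print(obj_vocab)
print(edges)
```

In this form no image features are involved: the encoder would see only node ids and labelled edges. My question is whether this matches what the model actually consumes.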