zhuang-li / FactualSceneGraph

The FACTUAL benchmark dataset and the pre-trained textual scene graph parser trained on it.
https://arxiv.org/pdf/2305.17497.pdf

Verb indexes and attribute indexes #3

KuofengGao closed this issue 10 months ago

KuofengGao commented 10 months ago

Scene graph parsers with node indexes make it possible to identify entity positions, but it remains unclear how to determine the positions of verbs or attributes within a sentence. For verbs, take "a man is wearing a shirt and a woman is wearing a plant": how can we locate the index of each "wear"? For attributes, take "a red cat, a blue dog and a red man": how can we locate the index of each "red"?
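To make the question concrete, here is a minimal sketch of the naive approach: whitespace tokenization plus string matching. It is not part of the FACTUAL parser. The helper name `token_positions` and the hand-written sets of surface forms are hypothetical. It finds every occurrence of a word, but it cannot tell which occurrence belongs to which entity.

```python
def token_positions(sentence, target_forms):
    """Return 0-based token indexes whose normalized form is in target_forms.

    Assumption: tokens are split on whitespace, with trailing commas and
    periods stripped. A real pipeline would use a proper tokenizer and
    lemmatizer instead of a hand-written set of surface forms.
    """
    tokens = [t.strip(",.").lower() for t in sentence.split()]
    return [i for i, t in enumerate(tokens) if t in target_forms]


verb_sentence = "a man is wearing a shirt and a woman is wearing a plant"
attr_sentence = "a red cat, a blue dog and a red man"

# Surface forms of the lemma "wear" are listed by hand (illustrative only).
verb_hits = token_positions(verb_sentence, {"wear", "wears", "wearing"})
attr_hits = token_positions(attr_sentence, {"red"})

print(verb_hits)  # [3, 10]
print(attr_hits)  # [1, 8]
```

Both occurrences are returned in each case, which is exactly the ambiguity the question is about.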

zhuang-li commented 10 months ago

Hi, I think indexing may not be essential for verbs or attributes that are repeated within a sentence. For example, in "a man is wearing a shirt and a woman is wearing a plant," the verb "wear" denotes the same action regardless of who performs it. Similarly, "red" denotes the same color in "a red cat, a blue dog, and a red man," so separate indexes for "red" may be redundant. However, consider the sentence "a man is wearing a shirt and a man is wearing a plant." In an image, the two "men" would fall within two different bounding boxes. Aligning each entity in the sentence with its image representation would therefore require indexes that differentiate the two "men". Such alignment is not addressed in the current work, though I believe it should be explored in future research.
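The distinction drawn above can be sketched as follows. This is an illustration only: the `noun:k` label format and the helper `number_mentions` are hypothetical and are not claimed to match the parser's actual output. Entity mentions receive a per-noun index so that repeated nouns stay distinct, while the verb is left unindexed because it denotes the same action in both triples.

```python
from collections import Counter


def number_mentions(mentions):
    """Label each noun mention as 'noun:k', where k counts earlier
    mentions of the same noun (hypothetical format, for illustration)."""
    seen = Counter()
    labeled = []
    for noun in mentions:
        labeled.append(f"{noun}:{seen[noun]}")
        seen[noun] += 1
    return labeled


# Entity mentions, in sentence order, for
# "a man is wearing a shirt and a man is wearing a plant".
labels = number_mentions(["man", "shirt", "man", "plant"])

# The verb "wear" carries no index; only the entities are distinguished.
triples = [
    (labels[0], "wear", labels[1]),
    (labels[2], "wear", labels[3]),
]

print(triples)
# [('man:0', 'wear', 'shirt:0'), ('man:1', 'wear', 'plant:0')]
```

With these labels, `man:0` and `man:1` could each be aligned to a different bounding box, whereas an unindexed "man" could not.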

KuofengGao commented 10 months ago

Thank you for your reply, which resolves my concern!