rohitgirdhar / CATER

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
https://rohitgirdhar.github.io/CATER/
Apache License 2.0

Spatial Relationships #6

Closed: roeiherz closed this issue 4 years ago

roeiherz commented 4 years ago

Hi,

Thanks for your interesting work.

Could you please explain the spatial relationships data? For example:

    F = json.load(open('max2action/scenes/CATER_new_004617.json'))
    len(F['relationships']['behind'])  # 56

The number of frames: 40

The original spatial relations should work as follows: if j is in F['relationships']['behind'][i], then object j is behind object i (see here). Here, however, the index i cannot be an object index, since there aren't 56 objects in the scene. Could you clarify?
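To make the mismatch concrete, here is a minimal sketch of a consistency check for the convention described above, where each relationship holds one list of object indices per object. The helper name `check_relationships` and the toy scene are hypothetical, and the sketch assumes the scene file exposes an `objects` list next to `relationships`, as CLEVR scene files do.

```python
def check_relationships(scene):
    """Return a list of problems; an empty list means the structure is consistent.

    Assumes scene["relationships"][name][i] is a list of indices j such that
    object j stands in relation `name` to object i.
    """
    num_objects = len(scene["objects"])
    problems = []
    for name, per_object in scene["relationships"].items():
        # One entry per object is expected under this convention.
        if len(per_object) != num_objects:
            problems.append(
                f"{name}: {len(per_object)} entries for {num_objects} objects"
            )
        # Every listed index must refer to an existing object.
        for i, related in enumerate(per_object):
            bad = [j for j in related if not 0 <= j < num_objects]
            if bad:
                problems.append(f"{name}[{i}]: out-of-range indices {bad}")
    return problems


# Hypothetical toy scene, not taken from the dataset.
toy_scene = {
    "objects": [{"shape": "cone"}, {"shape": "cube"}, {"shape": "sphere"}],
    "relationships": {"behind": [[1, 2], [2], []]},
}
print(check_relationships(toy_scene))
```

Running the same check on the file from the question would presumably report the length mismatch (56 entries versus the actual object count) rather than an empty list.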

Thanks,

Originally posted by @roeiherz in https://github.com/rohitgirdhar/CATER/issues/3#issuecomment-584618788

rohitgirdhar commented 4 years ago

Thanks for your interest. The spatial relationships part is actually a remnant of the original CLEVR code, and we don't really support it in CATER. This is partly because spatial relationships are expected to change continuously over time in a video, and partly because our main focus in this work is on exploring the temporal relationships between actions. Please disregard the spatial relationships part of the structure for CATER.
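For readers who still need spatial relations, one option is to recompute them per frame from object positions, since they change over time. The following is only a sketch under stated assumptions, not something CATER provides: the function `behind_per_frame`, the `behind_dir` vector, the `eps` threshold, and the per-frame position layout are all hypothetical. The dot-product test is loosely modeled on how CLEVR-style generators derive relations from a direction vector.

```python
def behind_per_frame(positions, behind_dir=(0.0, 1.0, 0.0), eps=0.2):
    """Compute a 'behind' relation separately for every frame.

    positions[t][i] is the (x, y, z) position of object i in frame t
    (an assumed layout). Returns result[t][i] = list of indices j such
    that object j is behind object i in frame t.
    """
    result = []
    for frame in positions:
        per_object = []
        for i, p_i in enumerate(frame):
            related = []
            for j, p_j in enumerate(frame):
                if i == j:
                    continue
                # Project the offset from object i to object j onto the
                # assumed "behind" direction.
                dot = sum((p_j[k] - p_i[k]) * behind_dir[k] for k in range(3))
                if dot > eps:
                    related.append(j)
            per_object.append(related)
        result.append(per_object)
    return result


# Two objects over two frames; object 1 moves from behind object 0 to in front.
frames = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, -1.0, 0.0)],
]
print(behind_per_frame(frames))  # [[[1], []], [[], [0]]]
```

The relation flips between the two frames, which illustrates why a single static relationships list cannot describe a video.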