Closed roeiherz closed 4 years ago
Thanks for your interest. The spatial relationships part is a remnant of the original CLEVR code, and we don't really support it in CATER. This is partly because spatial relationships are expected to change continuously over time in a video, and partly because our main focus in this work is on exploring the temporal relationships between the actions. Please disregard the spatial relationships part of the structure for CATER.
Hi,
Thanks for your interesting work.
Could you please explain the spatial relationships data?

`F = json.load(open('max2action/scenes/CATER_new_004617.json'))`

`len(F['relationships']['behind']) = 56`

The number of frames: 40

The original spatial relations should be: if `j` is in `F['relationships']['behind'][i]`, then object `j` is behind object `i` (see here). However, here the index `i` is not an object (there aren't 56 objects in the scene). Could you clarify?
Thanks,
Originally posted by @roeiherz in https://github.com/rohitgirdhar/CATER/issues/3#issuecomment-584618788
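
For anyone who wants to check a scene file against the CLEVR convention described in the question, here is a minimal sketch. It assumes the scene dictionary has an `objects` list alongside `relationships`, as in CLEVR scene files; the helper name `relationships_follow_clevr` and the toy scene are hypothetical and only for illustration. Since the reply above says this field is not supported in CATER, a failed check on CATER scenes is expected rather than a sign of a corrupted file.

```python
def relationships_follow_clevr(scene):
    """Return {relation: bool}, True when the relation table has one row per
    object and every entry is a valid object index (the CLEVR convention)."""
    # Assumption: scene["objects"] lists the objects, as in CLEVR scene files.
    num_objects = len(scene["objects"])
    result = {}
    for name, table in scene["relationships"].items():
        one_row_per_object = len(table) == num_objects
        indices_valid = all(0 <= j < num_objects for row in table for j in row)
        result[name] = one_row_per_object and indices_valid
    return result


# Hypothetical toy scene with 3 objects (not real CATER data).
toy_scene = {
    "objects": [{"shape": "cube"}, {"shape": "cone"}, {"shape": "sphere"}],
    "relationships": {
        # Row i lists the objects j that are behind object i.
        "behind": [[1, 2], [2], []],
        # 4 rows for 3 objects: does not follow the convention.
        "front": [[], [0], [0, 1], [0]],
    },
}

report = relationships_follow_clevr(toy_scene)
print(report)
```

To apply it to a real file, load the JSON as in the question and pass the resulting dictionary to the helper.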