rohitgirdhar / CATER

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
https://rohitgirdhar.github.io/CATER/
Apache License 2.0

instructions of json format data in the pre-generated scene dataset #3

Closed hehefan closed 4 years ago

hehefan commented 4 years ago

Hi,

Is there any documentation for the JSON format of the data in the pre-generated scene set?

Thanks.

rohitgirdhar commented 4 years ago

Thanks for your interest. You can refer to the generation script for full details on the format.

Here's an example covering some of the more important elements:

>>> import json
>>> F = json.load(open('all_actions/scenes/CATER_new_000822.json'))
>>> F['movements']  # For each object, lists every motion that object goes through, with the start and end frame of each
>>> F['movements']['SmoothCube_v2_1'][0]
[u'_rotate', None, 9, 39]  # The first motion of this object is a rotation from frame 9 to frame 39
>>> F['movements']['Cone_1'][3]
[u'_contain', u'Cone_0', 118, 144]  # The 2nd element is the object that gets contained. It's None if nothing is contained.
>>> len(F['objects'])  # Contains one structure per object
8
>>> F['objects'][0].keys()
[u'pixel_coords', u'color', u'material', u'locations', u'sized', u'instance', u'shape', u'3d_coords', u'rotation', u'size']
>>> F['objects'][0]['shape']
u'spl'  # `spl` is used for the snitch object. Other shape names should be obvious
>>> F['objects'][0]['locations']  # Gives a list of positions of the object on the board throughout the simulation, in (x, y, z) coordinates. x and y range from -3 to +3.
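As a rough illustration of how the `movements` entries described above could be consumed, here is a minimal Python 3 sketch. It assumes each entry has the form `[action, contained_object_or_None, start_frame, end_frame]`, as in the two examples shown, and that the end frame is inclusive (an assumption, not confirmed here). The helper names `iter_motions`, `active_at`, and `containments` are hypothetical, and the `scene` dict is a small in-memory stand-in rather than a real scene file.

```python
# Toy stand-in for a loaded scene file. Only the two entries marked
# "from the thread" come from the example above; the rest is made up.
scene = {
    "movements": {
        "SmoothCube_v2_1": [["_rotate", None, 9, 39]],       # from the thread
        "Cone_1": [
            ["_slide", None, 50, 80],                         # made up for illustration
            ["_contain", "Cone_0", 118, 144],                 # from the thread
        ],
    },
}


def iter_motions(scene):
    """Yield (object, action, target, start, end) for every motion in the scene."""
    for obj_name, motions in scene["movements"].items():
        for action, target, start, end in motions:
            yield obj_name, action, target, start, end


def active_at(scene, frame):
    """Return (object, action) pairs whose motion covers `frame`.

    Assumes the end frame is inclusive.
    """
    return [
        (obj, action)
        for obj, action, _target, start, end in iter_motions(scene)
        if start <= frame <= end
    ]


def containments(scene):
    """Return (container, contained, start, end) for every motion with a target."""
    return [
        (obj, target, start, end)
        for obj, _action, target, start, end in iter_motions(scene)
        if target is not None
    ]


print(active_at(scene, 20))
print(containments(scene))
```

For a real scene, `scene` would be replaced by the result of `json.load` on one of the scene files.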

roeiherz commented 4 years ago

Hi,

Thanks for your interesting work.

Could you please explain the spatial relationships data?

F = json.load(open('max2action/scenes/CATER_new_004617.json'))
len(F['relationships']['behind']) = 56
The number of frames: 40

The original spatial relations should be: if j is in F['relationships']['behind'][i], then object j is behind object i (see here). However, here the index i cannot be an object index (there aren't 56 objects in the scene). Could you clarify this?

Thanks,
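To make the convention in the question concrete, here is a minimal Python 3 sketch of the per-object reading (if `j` is in `relationships['behind'][i]`, then object `j` is behind object `i`), together with a check of whether the outer list has one entry per object. It does not establish how CATER actually lays out this field. The helper names and the toy scene are hypothetical.

```python
def behind_of(relationships, i):
    """Indices of objects behind object i, under the per-object convention."""
    return list(relationships["behind"][i])


def outer_length_matches_objects(scene, relation="behind"):
    """True if the relation list has exactly one entry per object."""
    return len(scene["relationships"][relation]) == len(scene["objects"])


# Toy scene with three objects (made up for illustration).
toy_scene = {
    "objects": [{"shape": "cube"}, {"shape": "cone"}, {"shape": "spl"}],
    "relationships": {"behind": [[1, 2], [2], []]},
}

print(behind_of(toy_scene["relationships"], 0))   # objects behind object 0
print(outer_length_matches_objects(toy_scene))    # per-object reading is plausible here
```

If the check returns False for a scene, as the numbers quoted above would imply (56 entries for far fewer objects), the outer index is not a plain object index and the per-object reading does not apply directly.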