instance_detections2D - Githubissues

pioneer-innovation commented 2 years ago

Hi AI2THOR team, I want to obtain the 2D detection bounding boxes of visible objects from event.instance_detections2D. I followed the ai2thor tutorial to set up a demo, but I found that there is only an attribute named instance_masks in event.instance_detections2D and there is no information about detection bounding boxes. Here is my code:

from ai2thor.controller import Controller

controller = Controller(
    agentMode="default",
    visibilityDistance=1.5,
    scene="FloorPlan212",

    gridSize=0.25,
    snapToGrid=True,
    rotateStepDegrees=90,

    renderDepthImage=False,
    renderInstanceSegmentation=True,
    renderObjectImage=True,

    width=300,
    height=300,
    fieldOfView=90
)

# Rotate left four times and inspect the detection object after each step.
for _ in range(4):
    event = controller.step(action="RotateLeft")
    print(event.instance_detections2D.__dict__)
    print(event.instance_detections2D.instance_masks.__dict__)
    print("\n")

I have tried many steps with different actions, but I still cannot find any information about 2D detection boxes in event.instance_detections2D. Here is the printed output:

{'instance_masks': <ai2thor.server.LazyInstanceSegmentationMasks object at 0x7f6a05295dc0>, '_detections2d': {}}
{'_masks': {}, '_loaded': False, '_unique_integer_keys': None, '_empty_mask': None, 'instance_segmentation_frame_uint32': array([[4290875010, 4290875010, 4290875010, ..., 4290875010, 4290875010,
        4290875010],
       [4290875010, 4290875010, 4290875010, ..., 4290875010, 4290875010,
        4290875010],
       [4290875010, 4290875010, 4290875010, ..., 4290875010, 4290875010,
        4290875010],
       ...,
       [4284752823, 4284752823, 4284752823, ..., 4281420172, 4281420172,
        4281420172],
       [4284752823, 4284752823, 4284752823, ..., 4281420172, 4281420172,
        4281420172],
       [4284752823, 4284752823, 4284752823, ..., 4281420172, 4281420172,
        4281420172]], dtype=uint32), '_alpha_channel_value': 255, 'instance_colors': {'Window|+00.02|+02.07|+02.49': [116, 71, 187], 'Window|+01.57|+02.07|+02.49': [212, 67, 47], 'Wall|-1|0|-1.05': [130, 142, 193], 'FP212:StandardDoorFrame1  1|-1.45|0|1.2': [211, 123, 84], 'FP212:StandardDoorFrame1 |-4.25|0|-0.75': [222, 187, 121], 'FP212:StandardDoorFrame1|-1.85|0|-1.45': [158, 12, 253], 'FP212:StandardKnob1  1|-1.45|0|1.2': [244, 121, 100], 'FP212:StandardDoor1  1|-1.45|1|0.775': [147, 159, 70], 'FP212:StandardKnob1 |-4.25|0|-0.75': [91, 85, 186], 'FP212:StandardDoor1 |-4.25|1|-1.175': [30, 25, 50], 'FP212:StandardKnob1|-1.85|0|-1.45': [131, 46, 179], 'FP212:StandardDoor1|-1.425|1|-1.45': [111, 225, 149], 'FP212:LightFixture1  5|0.008|3.6|-0.436': [41, 40, 77], 'FP212:LightFixture1  4|1.008|3.6|-0.436': [74, 169, 46], 'FP212:LightFixture1  3|2.008|3.6|-0.436': [213, 47, 127], 'FP212:LightFixture1  2|2.481|3.6|-0.436': [75, 26, 228], 'FP212:LightFixture1  1|1.481|3.6|-0.436': [165, 20, 37], 'FP212:LightFixture1 |0.481|3.6|-0.436': [18, 224, 254], 'Ceiling|0|3.8|0': [71, 243, 117], 'FP212:Fireplace|0|0|0': [17, 237, 232], 'Floor|+00.00|+00.00|+00.00': [88, 131, 103], 'Painting|+04.07|+01.95|+00.85': [172, 246, 191], 'Television|+01.90|+01.28|-00.84': [219, 211, 179], 'Laptop|+01.80|+00.47|+00.50': [62, 216, 95], 'FloorLamp|+03.61|+00.00|+02.16': [141, 119, 166], 'TissueBox|+03.92|+00.87|+00.68': [184, 67, 215], 'LightSwitch|-01.40|+01.29|+01.84': [194, 176, 88], 'RemoteControl|+01.88|+00.33|+01.73': [165, 165, 22], 'HousePlant|+00.39|+00.80|-00.73': [195, 50, 9], 'Newspaper|+02.15|+00.41|-00.72': [254, 186, 161], 'Boots|+04.00|+00.00|+01.70': [225, 185, 33], 'WateringCan|+01.62|+00.02|-00.70': [111, 91, 236], 'KeyChain|+01.50|+00.47|+00.53': [208, 10, 20], 'CreditCard|+01.41|+00.47|+00.65': [60, 92, 152], 'Pen|+03.93|+00.87|+01.04': [210, 232, 241], 'GarbageCan|+03.83|-00.03|-00.50': [145, 121, 57], 'Pencil|+03.89|+00.87|+01.18': [27, 124, 5], 
'Pillow|+00.65|+00.39|+01.71': [80, 74, 115], 'ArmChair|+02.66|+00.00|+01.86': [222, 236, 138], 'CoffeeTable|+01.59|00.00|+00.45': [105, 204, 158], 'Statue|-00.54|+00.40|-00.69': [154, 183, 247], 'Statue|-00.09|+00.03|-00.70': [202, 142, 60], 'TVStand|-00.29|00.00|-00.77': [140, 73, 49], 'Shelf|+01.91|+00.20|-00.73': [186, 61, 199], 'Shelf|+01.91|+00.59|-00.73': [68, 73, 158], 'Shelf|-00.29|+00.59|-00.73': [193, 106, 55], 'TVStand|+01.90|00.00|-00.77': [183, 35, 100], 'Shelf|-00.29|+00.20|-00.73': [88, 120, 178], 'ArmChair|-00.27|+00.00|+01.87': [150, 118, 31], 'Box|-00.47|+01.04|-00.71': [52, 209, 176], 'Sofa|+01.19|+00.01|+01.87': [173, 247, 203], 'Drawer|+03.88|+00.77|+00.86': [2, 114, 123], 'SideTable|+03.95|+00.00|+00.86': [242, 161, 247]}, 'class_colors': {'Window': [[116, 71, 187], [200, 150, 134], [212, 67, 47]], 'Wall': [[130, 142, 193], [148, 232, 164]], 'FP212:StandardDoorFrame1  1': [[211, 123, 84]], 'Door': [[244, 78, 20]], 'FP212:StandardDoorFrame1 ': [[222, 187, 121]], 'FP212:StandardDoorFrame1': [[158, 12, 253]], 'FP212:StandardKnob1  1': [[244, 121, 100]], 'FP212:StandardDoor1  1': [[147, 159, 70]], 'FP212:StandardKnob1 ': [[91, 85, 186]], 'FP212:StandardDoor1 ': [[30, 25, 50]], 'FP212:StandardKnob1': [[131, 46, 179]], 'FP212:StandardDoor1': [[111, 225, 149]], 'FP212:LightFixture1  5': [[41, 40, 77]], 'CeilingLight': [[230, 113, 120]], 'FP212:LightFixture1  4': [[74, 169, 46]], 'FP212:LightFixture1  3': [[213, 47, 127]], 'FP212:LightFixture1  2': [[75, 26, 228]], 'FP212:LightFixture1  1': [[165, 20, 37]], 'FP212:LightFixture1 ': [[18, 224, 254]], 'Ceiling': [[71, 243, 117], [50, 55, 251]], 'FP212:Fireplace': [[17, 237, 232]], 'FirePlace': [[250, 76, 248]], 'Floor': [[88, 131, 103], [243, 246, 208]], 'Painting': [[172, 246, 191], [40, 117, 236]], 'Television': [[219, 211, 179], [27, 245, 217]], 'Laptop': [[62, 216, 95], [20, 107, 222]], 'FloorLamp': [[141, 119, 166], [253, 73, 35]], 'TissueBox': [[184, 67, 215], [98, 43, 249]], 'LightSwitch': [[194, 
176, 88], [11, 51, 121]], 'RemoteControl': [[165, 165, 22], [187, 19, 208]], 'HousePlant': [[195, 50, 9], [73, 144, 213]], 'TheHand': [[146, 100, 153]], 'Newspaper': [[254, 186, 161], [19, 196, 2]], 'Boots': [[225, 185, 33], [121, 126, 101]], 'WateringCan': [[111, 91, 236], [147, 67, 249]], 'KeyChain': [[208, 10, 20], [27, 54, 18]], 'CreditCard': [[60, 92, 152], [56, 235, 12]], 'Pen': [[210, 232, 241], [239, 130, 152]], 'GarbageCan': [[145, 121, 57], [225, 40, 55]], 'Pencil': [[27, 124, 5], [177, 226, 23]], 'Pillow': [[80, 74, 115], [217, 193, 130]], 'agent_neck_1': [[16, 24, 176]], 'agent_body_1 (shadows)': [[233, 94, 109]], 'agent_collarbone_1 (shadows)': [[222, 46, 143]], 'agent_head_1 (shadows)': [[39, 201, 198]], 'stretch_robot_base_code': [[22, 130, 39]], 'ArmChair': [[222, 236, 138], [96, 52, 68], [150, 118, 31]], 'agent_head_1': [[122, 92, 42]], 'stretch_robot_laser': [[7, 65, 186]], 'stretch_robot_mast': [[150, 75, 198]], 'stretch_robot_head_1': [[217, 90, 117]], 'agent_neck_3': [[60, 83, 243]], 'robot_tilt_link_n (shadows)': [[25, 193, 224]], 'CoffeeTable': [[105, 204, 158], [18, 14, 75]], 'agent_head_2 (shadows)': [[96, 154, 25]], 'stretch_robot_head_3 (shadows)': [[223, 182, 152]], 'Statue': [[154, 183, 247], [243, 75, 41], [202, 142, 60]], 'agent_collarbone_2': [[233, 207, 130]], 'agent_neck_2 (shadows)': [[229, 81, 183]], 'agent_head_2': [[209, 89, 45]], 'stretch_robot_head_2': [[234, 200, 134]], 'agent_neck_3 (shadows)': [[76, 55, 128]], 'stretch_robot_omniwheel': [[31, 254, 187]], 'robot_elbow_link': [[101, 147, 231]], 'robot_camera_link_n': [[139, 142, 98]], 'robot_finger_r': [[19, 64, 240]], 'robot_finger_l': [[216, 72, 223]], 'robot_ar_tag': [[57, 112, 32]], 'robot_cam_mount_n': [[219, 204, 228]], 'robot_battery_n': [[4, 101, 154]], 'robot_plate': [[89, 60, 102]], 'robot_main_wheel_r_n': [[30, 143, 133]], 'robot_shoulder_link': [[76, 25, 229]], 'robot_roll_link_n': [[69, 33, 38]], 'robot_gripper_link': [[31, 177, 195]], 'robot_forearm_link': 
[[163, 79, 72]], 'robot_main_wheel_l_n': [[62, 33, 100]], 'robot_main_body': [[22, 91, 128]], 'robot_wrist_link': [[26, 108, 12]], 'robot_tilt_link_n': [[224, 90, 45]], 'agent_collarbone_2 (shadows)': [[1, 149, 227]], 'TVStand': [[140, 73, 49], [94, 234, 136], [183, 35, 100]], 'Shelf': [[186, 61, 199], [39, 54, 158], [68, 73, 158], [193, 106, 55], [88, 120, 178]], 'stretch_robot_head_2 (shadows)': [[59, 159, 89]], 'stretch_robot_wheel_r': [[225, 234, 123]], 'agent_collarbone_3': [[77, 220, 150]], 'stretch_robot_wheel_l': [[141, 244, 80]], 'agent_collarbone_3 (shadows)': [[244, 32, 83]], 'agent_neck_1 (shadows)': [[153, 228, 139]], 'Box': [[52, 209, 176], [60, 252, 230]], 'agent_body_1': [[244, 3, 18]], 'stretch_robot_base': [[228, 20, 44]], 'robot_roll_link_n (shadows)': [[89, 184, 7]], 'agent_neck_2': [[189, 103, 136]], 'agent_collarbone_1': [[74, 214, 33]], 'Sofa': [[173, 247, 203], [82, 143, 39]], 'stretch_robot_head_3': [[22, 163, 224]], 'Drawer': [[2, 114, 123], [155, 30, 210]], 'SideTable': [[242, 161, 247], [202, 45, 114]], 'robot_camera_link_n (shadows)': [[210, 68, 193]]}}

{'instance_masks': <ai2thor.server.LazyInstanceSegmentationMasks object at 0x7f6a0513d310>, '_detections2d': {}}
{'_masks': {}, '_loaded': False, '_unique_integer_keys': None, '_empty_mask': None, 'instance_segmentation_frame_uint32': array([[4285920071, 4285920071, 4285920071, ..., 4290875010, 4290875010,
        4290875010],
       [4285920071, 4285920071, 4285920071, ..., 4290875010, 4290875010,
        4290875010],
       [4285920071, 4285920071, 4285920071, ..., 4290875010, 4290875010,
        4290875010],
       ...,
       [4287294686, 4287294686, 4287294686, ..., 4289975259, 4289975259,
        4289975259],
       [4287294686, 4287294686, 4287294686, ..., 4284752823, 4284752823,
        4284752823],
       [4287294686, 4287294686, 4287294686, ..., 4284752823, 4284752823,
        4284752823]], dtype=uint32), '_alpha_channel_value': 255, 'instance_colors': {'Window|+00.02|+02.07|+02.49': [116, 71, 187], 'Window|+01.57|+02.07|+02.49': [212, 67, 47], 'Wall|-1|0|-1.05': [130, 142, 193], 'FP212:StandardDoorFrame1  1|-1.45|0|1.2': [211, 123, 84], 'FP212:StandardDoorFrame1 |-4.25|0|-0.75': [222, 187, 121], 'FP212:StandardDoorFrame1|-1.85|0|-1.45': [158, 12, 253], 'FP212:StandardKnob1  1|-1.45|0|1.2': [244, 121, 100], 'FP212:StandardDoor1  1|-1.45|1|0.775': [147, 159, 70], 'FP212:StandardKnob1 |-4.25|0|-0.75': [91, 85, 186], 'FP212:StandardDoor1 |-4.25|1|-1.175': [30, 25, 50], 'FP212:StandardKnob1|-1.85|0|-1.45': [131, 46, 179], 'FP212:StandardDoor1|-1.425|1|-1.45': [111, 225, 149], 'FP212:LightFixture1  5|0.008|3.6|-0.436': [41, 40, 77], 'FP212:LightFixture1  4|1.008|3.6|-0.436': [74, 169, 46], 'FP212:LightFixture1  3|2.008|3.6|-0.436': [213, 47, 127], 'FP212:LightFixture1  2|2.481|3.6|-0.436': [75, 26, 228], 'FP212:LightFixture1  1|1.481|3.6|-0.436': [165, 20, 37], 'FP212:LightFixture1 |0.481|3.6|-0.436': [18, 224, 254], 'Ceiling|0|3.8|0': [71, 243, 117], 'FP212:Fireplace|0|0|0': [17, 237, 232], 'Floor|+00.00|+00.00|+00.00': [88, 131, 103], 'Painting|+04.07|+01.95|+00.85': [172, 246, 191], 'Television|+01.90|+01.28|-00.84': [219, 211, 179], 'Laptop|+01.80|+00.47|+00.50': [62, 216, 95], 'FloorLamp|+03.61|+00.00|+02.16': [141, 119, 166], 'TissueBox|+03.92|+00.87|+00.68': [184, 67, 215], 'LightSwitch|-01.40|+01.29|+01.84': [194, 176, 88], 'RemoteControl|+01.88|+00.33|+01.73': [165, 165, 22], 'HousePlant|+00.39|+00.80|-00.73': [195, 50, 9], 'Newspaper|+02.15|+00.41|-00.72': [254, 186, 161], 'Boots|+04.00|+00.00|+01.70': [225, 185, 33], 'WateringCan|+01.62|+00.02|-00.70': [111, 91, 236], 'KeyChain|+01.50|+00.47|+00.53': [208, 10, 20], 'CreditCard|+01.41|+00.47|+00.65': [60, 92, 152], 'Pen|+03.93|+00.87|+01.04': [210, 232, 241], 'GarbageCan|+03.83|-00.03|-00.50': [145, 121, 57], 'Pencil|+03.89|+00.87|+01.18': [27, 124, 5], 
'Pillow|+00.65|+00.39|+01.71': [80, 74, 115], 'ArmChair|+02.66|+00.00|+01.86': [222, 236, 138], 'CoffeeTable|+01.59|00.00|+00.45': [105, 204, 158], 'Statue|-00.54|+00.40|-00.69': [154, 183, 247], 'Statue|-00.09|+00.03|-00.70': [202, 142, 60], 'TVStand|-00.29|00.00|-00.77': [140, 73, 49], 'Shelf|+01.91|+00.20|-00.73': [186, 61, 199], 'Shelf|+01.91|+00.59|-00.73': [68, 73, 158], 'Shelf|-00.29|+00.59|-00.73': [193, 106, 55], 'TVStand|+01.90|00.00|-00.77': [183, 35, 100], 'Shelf|-00.29|+00.20|-00.73': [88, 120, 178], 'ArmChair|-00.27|+00.00|+01.87': [150, 118, 31], 'Box|-00.47|+01.04|-00.71': [52, 209, 176], 'Sofa|+01.19|+00.01|+01.87': [173, 247, 203], 'Drawer|+03.88|+00.77|+00.86': [2, 114, 123], 'SideTable|+03.95|+00.00|+00.86': [242, 161, 247]}, 'class_colors': {'Window': [[116, 71, 187], [200, 150, 134], [212, 67, 47]], 'Wall': [[130, 142, 193], [148, 232, 164]], 'FP212:StandardDoorFrame1  1': [[211, 123, 84]], 'Door': [[244, 78, 20]], 'FP212:StandardDoorFrame1 ': [[222, 187, 121]], 'FP212:StandardDoorFrame1': [[158, 12, 253]], 'FP212:StandardKnob1  1': [[244, 121, 100]], 'FP212:StandardDoor1  1': [[147, 159, 70]], 'FP212:StandardKnob1 ': [[91, 85, 186]], 'FP212:StandardDoor1 ': [[30, 25, 50]], 'FP212:StandardKnob1': [[131, 46, 179]], 'FP212:StandardDoor1': [[111, 225, 149]], 'FP212:LightFixture1  5': [[41, 40, 77]], 'CeilingLight': [[230, 113, 120]], 'FP212:LightFixture1  4': [[74, 169, 46]], 'FP212:LightFixture1  3': [[213, 47, 127]], 'FP212:LightFixture1  2': [[75, 26, 228]], 'FP212:LightFixture1  1': [[165, 20, 37]], 'FP212:LightFixture1 ': [[18, 224, 254]], 'Ceiling': [[71, 243, 117], [50, 55, 251]], 'FP212:Fireplace': [[17, 237, 232]], 'FirePlace': [[250, 76, 248]], 'Floor': [[88, 131, 103], [243, 246, 208]], 'Painting': [[172, 246, 191], [40, 117, 236]], 'Television': [[219, 211, 179], [27, 245, 217]], 'Laptop': [[62, 216, 95], [20, 107, 222]], 'FloorLamp': [[141, 119, 166], [253, 73, 35]], 'TissueBox': [[184, 67, 215], [98, 43, 249]], 'LightSwitch': [[194, 
176, 88], [11, 51, 121]], 'RemoteControl': [[165, 165, 22], [187, 19, 208]], 'HousePlant': [[195, 50, 9], [73, 144, 213]], 'TheHand': [[146, 100, 153]], 'Newspaper': [[254, 186, 161], [19, 196, 2]], 'Boots': [[225, 185, 33], [121, 126, 101]], 'WateringCan': [[111, 91, 236], [147, 67, 249]], 'KeyChain': [[208, 10, 20], [27, 54, 18]], 'CreditCard': [[60, 92, 152], [56, 235, 12]], 'Pen': [[210, 232, 241], [239, 130, 152]], 'GarbageCan': [[145, 121, 57], [225, 40, 55]], 'Pencil': [[27, 124, 5], [177, 226, 23]], 'Pillow': [[80, 74, 115], [217, 193, 130]], 'agent_neck_1': [[16, 24, 176]], 'agent_body_1 (shadows)': [[233, 94, 109]], 'agent_collarbone_1 (shadows)': [[222, 46, 143]], 'agent_head_1 (shadows)': [[39, 201, 198]], 'stretch_robot_base_code': [[22, 130, 39]], 'ArmChair': [[222, 236, 138], [96, 52, 68], [150, 118, 31]], 'agent_head_1': [[122, 92, 42]], 'stretch_robot_laser': [[7, 65, 186]], 'stretch_robot_mast': [[150, 75, 198]], 'stretch_robot_head_1': [[217, 90, 117]], 'agent_neck_3': [[60, 83, 243]], 'robot_tilt_link_n (shadows)': [[25, 193, 224]], 'CoffeeTable': [[105, 204, 158], [18, 14, 75]], 'agent_head_2 (shadows)': [[96, 154, 25]], 'stretch_robot_head_3 (shadows)': [[223, 182, 152]], 'Statue': [[154, 183, 247], [243, 75, 41], [202, 142, 60]], 'agent_collarbone_2': [[233, 207, 130]], 'agent_neck_2 (shadows)': [[229, 81, 183]], 'agent_head_2': [[209, 89, 45]], 'stretch_robot_head_2': [[234, 200, 134]], 'agent_neck_3 (shadows)': [[76, 55, 128]], 'stretch_robot_omniwheel': [[31, 254, 187]], 'robot_elbow_link': [[101, 147, 231]], 'robot_camera_link_n': [[139, 142, 98]], 'robot_finger_r': [[19, 64, 240]], 'robot_finger_l': [[216, 72, 223]], 'robot_ar_tag': [[57, 112, 32]], 'robot_cam_mount_n': [[219, 204, 228]], 'robot_battery_n': [[4, 101, 154]], 'robot_plate': [[89, 60, 102]], 'robot_main_wheel_r_n': [[30, 143, 133]], 'robot_shoulder_link': [[76, 25, 229]], 'robot_roll_link_n': [[69, 33, 38]], 'robot_gripper_link': [[31, 177, 195]], 'robot_forearm_link': 
[[163, 79, 72]], 'robot_main_wheel_l_n': [[62, 33, 100]], 'robot_main_body': [[22, 91, 128]], 'robot_wrist_link': [[26, 108, 12]], 'robot_tilt_link_n': [[224, 90, 45]], 'agent_collarbone_2 (shadows)': [[1, 149, 227]], 'TVStand': [[140, 73, 49], [94, 234, 136], [183, 35, 100]], 'Shelf': [[186, 61, 199], [39, 54, 158], [68, 73, 158], [193, 106, 55], [88, 120, 178]], 'stretch_robot_head_2 (shadows)': [[59, 159, 89]], 'stretch_robot_wheel_r': [[225, 234, 123]], 'agent_collarbone_3': [[77, 220, 150]], 'stretch_robot_wheel_l': [[141, 244, 80]], 'agent_collarbone_3 (shadows)': [[244, 32, 83]], 'agent_neck_1 (shadows)': [[153, 228, 139]], 'Box': [[52, 209, 176], [60, 252, 230]], 'agent_body_1': [[244, 3, 18]], 'stretch_robot_base': [[228, 20, 44]], 'robot_roll_link_n (shadows)': [[89, 184, 7]], 'agent_neck_2': [[189, 103, 136]], 'agent_collarbone_1': [[74, 214, 33]], 'Sofa': [[173, 247, 203], [82, 143, 39]], 'stretch_robot_head_3': [[22, 163, 224]], 'Drawer': [[2, 114, 123], [155, 30, 210]], 'SideTable': [[242, 161, 247], [202, 45, 114]], 'robot_camera_link_n (shadows)': [[210, 68, 193]]}}

{'instance_masks': <ai2thor.server.LazyInstanceSegmentationMasks object at 0x7f6a0513d490>, '_detections2d': {}}
{'_masks': {}, '_loaded': False, '_unique_integer_keys': None, '_empty_mask': None, 'instance_segmentation_frame_uint32': array([[4285920071, 4285920071, 4285920071, ..., 4285920071, 4285920071,
        4285920071],
       [4285920071, 4285920071, 4285920071, ..., 4285920071, 4285920071,
        4285920071],
       [4285920071, 4285920071, 4285920071, ..., 4285920071, 4285920071,
        4285920071],
       ...,
       [4280252054, 4280252054, 4280252054, ..., 4291557293, 4287294686,
        4287294686],
       [4280252054, 4280252054, 4280252054, ..., 4291557293, 4287294686,
        4287294686],
       [4280252054, 4280252054, 4280252054, ..., 4291557293, 4287294686,
        4287294686]], dtype=uint32), '_alpha_channel_value': 255, 'instance_colors': {'Window|+00.02|+02.07|+02.49': [116, 71, 187], 'Window|+01.57|+02.07|+02.49': [212, 67, 47], 'Wall|-1|0|-1.05': [130, 142, 193], 'FP212:StandardDoorFrame1  1|-1.45|0|1.2': [211, 123, 84], 'FP212:StandardDoorFrame1 |-4.25|0|-0.75': [222, 187, 121], 'FP212:StandardDoorFrame1|-1.85|0|-1.45': [158, 12, 253], 'FP212:StandardKnob1  1|-1.45|0|1.2': [244, 121, 100], 'FP212:StandardDoor1  1|-1.45|1|0.775': [147, 159, 70], 'FP212:StandardKnob1 |-4.25|0|-0.75': [91, 85, 186], 'FP212:StandardDoor1 |-4.25|1|-1.175': [30, 25, 50], 'FP212:StandardKnob1|-1.85|0|-1.45': [131, 46, 179], 'FP212:StandardDoor1|-1.425|1|-1.45': [111, 225, 149], 'FP212:LightFixture1  5|0.008|3.6|-0.436': [41, 40, 77], 'FP212:LightFixture1  4|1.008|3.6|-0.436': [74, 169, 46], 'FP212:LightFixture1  3|2.008|3.6|-0.436': [213, 47, 127], 'FP212:LightFixture1  2|2.481|3.6|-0.436': [75, 26, 228], 'FP212:LightFixture1  1|1.481|3.6|-0.436': [165, 20, 37], 'FP212:LightFixture1 |0.481|3.6|-0.436': [18, 224, 254], 'Ceiling|0|3.8|0': [71, 243, 117], 'FP212:Fireplace|0|0|0': [17, 237, 232], 'Floor|+00.00|+00.00|+00.00': [88, 131, 103], 'Painting|+04.07|+01.95|+00.85': [172, 246, 191], 'Television|+01.90|+01.28|-00.84': [219, 211, 179], 'Laptop|+01.80|+00.47|+00.50': [62, 216, 95], 'FloorLamp|+03.61|+00.00|+02.16': [141, 119, 166], 'TissueBox|+03.92|+00.87|+00.68': [184, 67, 215], 'LightSwitch|-01.40|+01.29|+01.84': [194, 176, 88], 'RemoteControl|+01.88|+00.33|+01.73': [165, 165, 22], 'HousePlant|+00.39|+00.80|-00.73': [195, 50, 9], 'Newspaper|+02.15|+00.41|-00.72': [254, 186, 161], 'Boots|+04.00|+00.00|+01.70': [225, 185, 33], 'WateringCan|+01.62|+00.02|-00.70': [111, 91, 236], 'KeyChain|+01.50|+00.47|+00.53': [208, 10, 20], 'CreditCard|+01.41|+00.47|+00.65': [60, 92, 152], 'Pen|+03.93|+00.87|+01.04': [210, 232, 241], 'GarbageCan|+03.83|-00.03|-00.50': [145, 121, 57], 'Pencil|+03.89|+00.87|+01.18': [27, 124, 5], 
'Pillow|+00.65|+00.39|+01.71': [80, 74, 115], 'ArmChair|+02.66|+00.00|+01.86': [222, 236, 138], 'CoffeeTable|+01.59|00.00|+00.45': [105, 204, 158], 'Statue|-00.54|+00.40|-00.69': [154, 183, 247], 'Statue|-00.09|+00.03|-00.70': [202, 142, 60], 'TVStand|-00.29|00.00|-00.77': [140, 73, 49], 'Shelf|+01.91|+00.20|-00.73': [186, 61, 199], 'Shelf|+01.91|+00.59|-00.73': [68, 73, 158], 'Shelf|-00.29|+00.59|-00.73': [193, 106, 55], 'TVStand|+01.90|00.00|-00.77': [183, 35, 100], 'Shelf|-00.29|+00.20|-00.73': [88, 120, 178], 'ArmChair|-00.27|+00.00|+01.87': [150, 118, 31], 'Box|-00.47|+01.04|-00.71': [52, 209, 176], 'Sofa|+01.19|+00.01|+01.87': [173, 247, 203], 'Drawer|+03.88|+00.77|+00.86': [2, 114, 123], 'SideTable|+03.95|+00.00|+00.86': [242, 161, 247]}, 'class_colors': {'Window': [[116, 71, 187], [200, 150, 134], [212, 67, 47]], 'Wall': [[130, 142, 193], [148, 232, 164]], 'FP212:StandardDoorFrame1  1': [[211, 123, 84]], 'Door': [[244, 78, 20]], 'FP212:StandardDoorFrame1 ': [[222, 187, 121]], 'FP212:StandardDoorFrame1': [[158, 12, 253]], 'FP212:StandardKnob1  1': [[244, 121, 100]], 'FP212:StandardDoor1  1': [[147, 159, 70]], 'FP212:StandardKnob1 ': [[91, 85, 186]], 'FP212:StandardDoor1 ': [[30, 25, 50]], 'FP212:StandardKnob1': [[131, 46, 179]], 'FP212:StandardDoor1': [[111, 225, 149]], 'FP212:LightFixture1  5': [[41, 40, 77]], 'CeilingLight': [[230, 113, 120]], 'FP212:LightFixture1  4': [[74, 169, 46]], 'FP212:LightFixture1  3': [[213, 47, 127]], 'FP212:LightFixture1  2': [[75, 26, 228]], 'FP212:LightFixture1  1': [[165, 20, 37]], 'FP212:LightFixture1 ': [[18, 224, 254]], 'Ceiling': [[71, 243, 117], [50, 55, 251]], 'FP212:Fireplace': [[17, 237, 232]], 'FirePlace': [[250, 76, 248]], 'Floor': [[88, 131, 103], [243, 246, 208]], 'Painting': [[172, 246, 191], [40, 117, 236]], 'Television': [[219, 211, 179], [27, 245, 217]], 'Laptop': [[62, 216, 95], [20, 107, 222]], 'FloorLamp': [[141, 119, 166], [253, 73, 35]], 'TissueBox': [[184, 67, 215], [98, 43, 249]], 'LightSwitch': [[194, 
176, 88], [11, 51, 121]], 'RemoteControl': [[165, 165, 22], [187, 19, 208]], 'HousePlant': [[195, 50, 9], [73, 144, 213]], 'TheHand': [[146, 100, 153]], 'Newspaper': [[254, 186, 161], [19, 196, 2]], 'Boots': [[225, 185, 33], [121, 126, 101]], 'WateringCan': [[111, 91, 236], [147, 67, 249]], 'KeyChain': [[208, 10, 20], [27, 54, 18]], 'CreditCard': [[60, 92, 152], [56, 235, 12]], 'Pen': [[210, 232, 241], [239, 130, 152]], 'GarbageCan': [[145, 121, 57], [225, 40, 55]], 'Pencil': [[27, 124, 5], [177, 226, 23]], 'Pillow': [[80, 74, 115], [217, 193, 130]], 'agent_neck_1': [[16, 24, 176]], 'agent_body_1 (shadows)': [[233, 94, 109]], 'agent_collarbone_1 (shadows)': [[222, 46, 143]], 'agent_head_1 (shadows)': [[39, 201, 198]], 'stretch_robot_base_code': [[22, 130, 39]], 'ArmChair': [[222, 236, 138], [96, 52, 68], [150, 118, 31]], 'agent_head_1': [[122, 92, 42]], 'stretch_robot_laser': [[7, 65, 186]], 'stretch_robot_mast': [[150, 75, 198]], 'stretch_robot_head_1': [[217, 90, 117]], 'agent_neck_3': [[60, 83, 243]], 'robot_tilt_link_n (shadows)': [[25, 193, 224]], 'CoffeeTable': [[105, 204, 158], [18, 14, 75]], 'agent_head_2 (shadows)': [[96, 154, 25]], 'stretch_robot_head_3 (shadows)': [[223, 182, 152]], 'Statue': [[154, 183, 247], [243, 75, 41], [202, 142, 60]], 'agent_collarbone_2': [[233, 207, 130]], 'agent_neck_2 (shadows)': [[229, 81, 183]], 'agent_head_2': [[209, 89, 45]], 'stretch_robot_head_2': [[234, 200, 134]], 'agent_neck_3 (shadows)': [[76, 55, 128]], 'stretch_robot_omniwheel': [[31, 254, 187]], 'robot_elbow_link': [[101, 147, 231]], 'robot_camera_link_n': [[139, 142, 98]], 'robot_finger_r': [[19, 64, 240]], 'robot_finger_l': [[216, 72, 223]], 'robot_ar_tag': [[57, 112, 32]], 'robot_cam_mount_n': [[219, 204, 228]], 'robot_battery_n': [[4, 101, 154]], 'robot_plate': [[89, 60, 102]], 'robot_main_wheel_r_n': [[30, 143, 133]], 'robot_shoulder_link': [[76, 25, 229]], 'robot_roll_link_n': [[69, 33, 38]], 'robot_gripper_link': [[31, 177, 195]], 'robot_forearm_link': 
[[163, 79, 72]], 'robot_main_wheel_l_n': [[62, 33, 100]], 'robot_main_body': [[22, 91, 128]], 'robot_wrist_link': [[26, 108, 12]], 'robot_tilt_link_n': [[224, 90, 45]], 'agent_collarbone_2 (shadows)': [[1, 149, 227]], 'TVStand': [[140, 73, 49], [94, 234, 136], [183, 35, 100]], 'Shelf': [[186, 61, 199], [39, 54, 158], [68, 73, 158], [193, 106, 55], [88, 120, 178]], 'stretch_robot_head_2 (shadows)': [[59, 159, 89]], 'stretch_robot_wheel_r': [[225, 234, 123]], 'agent_collarbone_3': [[77, 220, 150]], 'stretch_robot_wheel_l': [[141, 244, 80]], 'agent_collarbone_3 (shadows)': [[244, 32, 83]], 'agent_neck_1 (shadows)': [[153, 228, 139]], 'Box': [[52, 209, 176], [60, 252, 230]], 'agent_body_1': [[244, 3, 18]], 'stretch_robot_base': [[228, 20, 44]], 'robot_roll_link_n (shadows)': [[89, 184, 7]], 'agent_neck_2': [[189, 103, 136]], 'agent_collarbone_1': [[74, 214, 33]], 'Sofa': [[173, 247, 203], [82, 143, 39]], 'stretch_robot_head_3': [[22, 163, 224]], 'Drawer': [[2, 114, 123], [155, 30, 210]], 'SideTable': [[242, 161, 247], [202, 45, 114]], 'robot_camera_link_n (shadows)': [[210, 68, 193]]}}

{'instance_masks': <ai2thor.server.LazyInstanceSegmentationMasks object at 0x7f6a0513d610>, '_detections2d': {}}
{'_masks': {}, '_loaded': False, '_unique_integer_keys': None, '_empty_mask': None, 'instance_segmentation_frame_uint32': array([[4290875010, 4290875010, 4290875010, ..., 4285920071, 4285920071,
        4285920071],
       [4290875010, 4290875010, 4290875010, ..., 4285920071, 4285920071,
        4285920071],
       [4290875010, 4290875010, 4290875010, ..., 4285920071, 4285920071,
        4285920071],
       ...,
       [4281420172, 4281420172, 4281420172, ..., 4284973912, 4284973912,
        4284973912],
       [4281420172, 4281420172, 4281420172, ..., 4284973912, 4284973912,
        4284973912],
       [4281420172, 4281420172, 4281420172, ..., 4284973912, 4284973912,
        4284973912]], dtype=uint32), '_alpha_channel_value': 255, 'instance_colors': {'Window|+00.02|+02.07|+02.49': [116, 71, 187], 'Window|+01.57|+02.07|+02.49': [212, 67, 47], 'Wall|-1|0|-1.05': [130, 142, 193], 'FP212:StandardDoorFrame1  1|-1.45|0|1.2': [211, 123, 84], 'FP212:StandardDoorFrame1 |-4.25|0|-0.75': [222, 187, 121], 'FP212:StandardDoorFrame1|-1.85|0|-1.45': [158, 12, 253], 'FP212:StandardKnob1  1|-1.45|0|1.2': [244, 121, 100], 'FP212:StandardDoor1  1|-1.45|1|0.775': [147, 159, 70], 'FP212:StandardKnob1 |-4.25|0|-0.75': [91, 85, 186], 'FP212:StandardDoor1 |-4.25|1|-1.175': [30, 25, 50], 'FP212:StandardKnob1|-1.85|0|-1.45': [131, 46, 179], 'FP212:StandardDoor1|-1.425|1|-1.45': [111, 225, 149], 'FP212:LightFixture1  5|0.008|3.6|-0.436': [41, 40, 77], 'FP212:LightFixture1  4|1.008|3.6|-0.436': [74, 169, 46], 'FP212:LightFixture1  3|2.008|3.6|-0.436': [213, 47, 127], 'FP212:LightFixture1  2|2.481|3.6|-0.436': [75, 26, 228], 'FP212:LightFixture1  1|1.481|3.6|-0.436': [165, 20, 37], 'FP212:LightFixture1 |0.481|3.6|-0.436': [18, 224, 254], 'Ceiling|0|3.8|0': [71, 243, 117], 'FP212:Fireplace|0|0|0': [17, 237, 232], 'Floor|+00.00|+00.00|+00.00': [88, 131, 103], 'Painting|+04.07|+01.95|+00.85': [172, 246, 191], 'Television|+01.90|+01.28|-00.84': [219, 211, 179], 'Laptop|+01.80|+00.47|+00.50': [62, 216, 95], 'FloorLamp|+03.61|+00.00|+02.16': [141, 119, 166], 'TissueBox|+03.92|+00.87|+00.68': [184, 67, 215], 'LightSwitch|-01.40|+01.29|+01.84': [194, 176, 88], 'RemoteControl|+01.88|+00.33|+01.73': [165, 165, 22], 'HousePlant|+00.39|+00.80|-00.73': [195, 50, 9], 'Newspaper|+02.15|+00.41|-00.72': [254, 186, 161], 'Boots|+04.00|+00.00|+01.70': [225, 185, 33], 'WateringCan|+01.62|+00.02|-00.70': [111, 91, 236], 'KeyChain|+01.50|+00.47|+00.53': [208, 10, 20], 'CreditCard|+01.41|+00.47|+00.65': [60, 92, 152], 'Pen|+03.93|+00.87|+01.04': [210, 232, 241], 'GarbageCan|+03.83|-00.03|-00.50': [145, 121, 57], 'Pencil|+03.89|+00.87|+01.18': [27, 124, 5], 
'Pillow|+00.65|+00.39|+01.71': [80, 74, 115], 'ArmChair|+02.66|+00.00|+01.86': [222, 236, 138], 'CoffeeTable|+01.59|00.00|+00.45': [105, 204, 158], 'Statue|-00.54|+00.40|-00.69': [154, 183, 247], 'Statue|-00.09|+00.03|-00.70': [202, 142, 60], 'TVStand|-00.29|00.00|-00.77': [140, 73, 49], 'Shelf|+01.91|+00.20|-00.73': [186, 61, 199], 'Shelf|+01.91|+00.59|-00.73': [68, 73, 158], 'Shelf|-00.29|+00.59|-00.73': [193, 106, 55], 'TVStand|+01.90|00.00|-00.77': [183, 35, 100], 'Shelf|-00.29|+00.20|-00.73': [88, 120, 178], 'ArmChair|-00.27|+00.00|+01.87': [150, 118, 31], 'Box|-00.47|+01.04|-00.71': [52, 209, 176], 'Sofa|+01.19|+00.01|+01.87': [173, 247, 203], 'Drawer|+03.88|+00.77|+00.86': [2, 114, 123], 'SideTable|+03.95|+00.00|+00.86': [242, 161, 247]}, 'class_colors': {'Window': [[116, 71, 187], [200, 150, 134], [212, 67, 47]], 'Wall': [[130, 142, 193], [148, 232, 164]], 'FP212:StandardDoorFrame1  1': [[211, 123, 84]], 'Door': [[244, 78, 20]], 'FP212:StandardDoorFrame1 ': [[222, 187, 121]], 'FP212:StandardDoorFrame1': [[158, 12, 253]], 'FP212:StandardKnob1  1': [[244, 121, 100]], 'FP212:StandardDoor1  1': [[147, 159, 70]], 'FP212:StandardKnob1 ': [[91, 85, 186]], 'FP212:StandardDoor1 ': [[30, 25, 50]], 'FP212:StandardKnob1': [[131, 46, 179]], 'FP212:StandardDoor1': [[111, 225, 149]], 'FP212:LightFixture1  5': [[41, 40, 77]], 'CeilingLight': [[230, 113, 120]], 'FP212:LightFixture1  4': [[74, 169, 46]], 'FP212:LightFixture1  3': [[213, 47, 127]], 'FP212:LightFixture1  2': [[75, 26, 228]], 'FP212:LightFixture1  1': [[165, 20, 37]], 'FP212:LightFixture1 ': [[18, 224, 254]], 'Ceiling': [[71, 243, 117], [50, 55, 251]], 'FP212:Fireplace': [[17, 237, 232]], 'FirePlace': [[250, 76, 248]], 'Floor': [[88, 131, 103], [243, 246, 208]], 'Painting': [[172, 246, 191], [40, 117, 236]], 'Television': [[219, 211, 179], [27, 245, 217]], 'Laptop': [[62, 216, 95], [20, 107, 222]], 'FloorLamp': [[141, 119, 166], [253, 73, 35]], 'TissueBox': [[184, 67, 215], [98, 43, 249]], 'LightSwitch': [[194, 
176, 88], [11, 51, 121]], 'RemoteControl': [[165, 165, 22], [187, 19, 208]], 'HousePlant': [[195, 50, 9], [73, 144, 213]], 'TheHand': [[146, 100, 153]], 'Newspaper': [[254, 186, 161], [19, 196, 2]], 'Boots': [[225, 185, 33], [121, 126, 101]], 'WateringCan': [[111, 91, 236], [147, 67, 249]], 'KeyChain': [[208, 10, 20], [27, 54, 18]], 'CreditCard': [[60, 92, 152], [56, 235, 12]], 'Pen': [[210, 232, 241], [239, 130, 152]], 'GarbageCan': [[145, 121, 57], [225, 40, 55]], 'Pencil': [[27, 124, 5], [177, 226, 23]], 'Pillow': [[80, 74, 115], [217, 193, 130]], 'agent_neck_1': [[16, 24, 176]], 'agent_body_1 (shadows)': [[233, 94, 109]], 'agent_collarbone_1 (shadows)': [[222, 46, 143]], 'agent_head_1 (shadows)': [[39, 201, 198]], 'stretch_robot_base_code': [[22, 130, 39]], 'ArmChair': [[222, 236, 138], [96, 52, 68], [150, 118, 31]], 'agent_head_1': [[122, 92, 42]], 'stretch_robot_laser': [[7, 65, 186]], 'stretch_robot_mast': [[150, 75, 198]], 'stretch_robot_head_1': [[217, 90, 117]], 'agent_neck_3': [[60, 83, 243]], 'robot_tilt_link_n (shadows)': [[25, 193, 224]], 'CoffeeTable': [[105, 204, 158], [18, 14, 75]], 'agent_head_2 (shadows)': [[96, 154, 25]], 'stretch_robot_head_3 (shadows)': [[223, 182, 152]], 'Statue': [[154, 183, 247], [243, 75, 41], [202, 142, 60]], 'agent_collarbone_2': [[233, 207, 130]], 'agent_neck_2 (shadows)': [[229, 81, 183]], 'agent_head_2': [[209, 89, 45]], 'stretch_robot_head_2': [[234, 200, 134]], 'agent_neck_3 (shadows)': [[76, 55, 128]], 'stretch_robot_omniwheel': [[31, 254, 187]], 'robot_elbow_link': [[101, 147, 231]], 'robot_camera_link_n': [[139, 142, 98]], 'robot_finger_r': [[19, 64, 240]], 'robot_finger_l': [[216, 72, 223]], 'robot_ar_tag': [[57, 112, 32]], 'robot_cam_mount_n': [[219, 204, 228]], 'robot_battery_n': [[4, 101, 154]], 'robot_plate': [[89, 60, 102]], 'robot_main_wheel_r_n': [[30, 143, 133]], 'robot_shoulder_link': [[76, 25, 229]], 'robot_roll_link_n': [[69, 33, 38]], 'robot_gripper_link': [[31, 177, 195]], 'robot_forearm_link': 
[[163, 79, 72]], 'robot_main_wheel_l_n': [[62, 33, 100]], 'robot_main_body': [[22, 91, 128]], 'robot_wrist_link': [[26, 108, 12]], 'robot_tilt_link_n': [[224, 90, 45]], 'agent_collarbone_2 (shadows)': [[1, 149, 227]], 'TVStand': [[140, 73, 49], [94, 234, 136], [183, 35, 100]], 'Shelf': [[186, 61, 199], [39, 54, 158], [68, 73, 158], [193, 106, 55], [88, 120, 178]], 'stretch_robot_head_2 (shadows)': [[59, 159, 89]], 'stretch_robot_wheel_r': [[225, 234, 123]], 'agent_collarbone_3': [[77, 220, 150]], 'stretch_robot_wheel_l': [[141, 244, 80]], 'agent_collarbone_3 (shadows)': [[244, 32, 83]], 'agent_neck_1 (shadows)': [[153, 228, 139]], 'Box': [[52, 209, 176], [60, 252, 230]], 'agent_body_1': [[244, 3, 18]], 'stretch_robot_base': [[228, 20, 44]], 'robot_roll_link_n (shadows)': [[89, 184, 7]], 'agent_neck_2': [[189, 103, 136]], 'agent_collarbone_1': [[74, 214, 33]], 'Sofa': [[173, 247, 203], [82, 143, 39]], 'stretch_robot_head_3': [[22, 163, 224]], 'Drawer': [[2, 114, 123], [155, 30, 210]], 'SideTable': [[242, 161, 247], [202, 45, 114]], 'robot_camera_link_n (shadows)': [[210, 68, 193]]}}
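
A side note for anyone reading the dump above: the values in instance_segmentation_frame_uint32 look like packed colors. Assuming the layout is alpha in the top byte, then blue, green, and red (an assumption inferred from the printed numbers, not taken from the ai2thor source), 4290875010 unpacks to [130, 142, 193], which is the instance_colors entry for 'Wall|-1|0|-1.05'. A minimal sketch of that lookup, where unpack_rgb and object_id_for_pixel are hypothetical helper names:

```python
# Hypothetical helpers. The byte layout (A, B, G, R from high to low byte)
# is inferred from the printed output, not from the ai2thor source.

def unpack_rgb(pixel):
    """Split a packed uint32 pixel into an [R, G, B] list."""
    r = pixel & 0xFF
    g = (pixel >> 8) & 0xFF
    b = (pixel >> 16) & 0xFF
    return [r, g, b]


def object_id_for_pixel(pixel, instance_colors):
    """Return the object ID whose color matches the pixel, or None."""
    rgb = unpack_rgb(pixel)
    for object_id, color in instance_colors.items():
        if list(color) == rgb:
            return object_id
    return None


# A few entries copied from the printed instance_colors dict.
instance_colors = {
    "Wall|-1|0|-1.05": [130, 142, 193],
    "Ceiling|0|3.8|0": [71, 243, 117],
    "TVStand|+01.90|00.00|-00.77": [183, 35, 100],
}

print(unpack_rgb(4290875010))                             # [130, 142, 193]
print(object_id_for_pixel(4285920071, instance_colors))   # Ceiling|0|3.8|0
```

So the segmentation frame itself is populated in every printout; only the '_detections2d' dict is empty.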

pioneer-innovation commented 2 years ago

Hi @mattdeitke , Could you please help me? T-T

mattdeitke commented 2 years ago

Hey @pioneer-innovation,

Good question! It looks like the API was updated and I wasn't aware of it. Looking into it now :)

pioneer-innovation commented 2 years ago

Thank you! I am looking forward to using ProcTHOR once this question is resolved. It is a fantastic embodied AI platform!

mattdeitke commented 2 years ago

I'm going to update the documentation with this.

Basically there was a change to how the instance masks are rendered, such that the bounding box is only generated when it's requested, and not every time. This is done to speed up the FPS when using instance segmentation, since most users don't often utilize all the bounding boxes when instance segmentation is on.
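To illustrate the idea of generating a box only when it is requested, here is a toy sketch in plain Python. It is not ai2thor's actual implementation: the `LazyBoxes` class, the nested-list mask format, and the `Cup|demo` key are all hypothetical and only show the compute-on-first-access pattern.

```python
class LazyBoxes:
    """Toy illustration of on-demand boxes; not ai2thor's actual implementation."""

    def __init__(self, masks):
        self._masks = masks  # {object_id: 2D list of 0/1 rows} (hypothetical format)
        self._cache = {}     # boxes computed so far, empty until something is requested

    def __getitem__(self, object_id):
        # Compute the box the first time it is asked for, then reuse it.
        if object_id not in self._cache:
            self._cache[object_id] = self._box_from_mask(self._masks[object_id])
        return self._cache[object_id]

    @staticmethod
    def _box_from_mask(mask):
        # Rows and columns that contain at least one mask pixel.
        rows = [y for y, row in enumerate(mask) if any(row)]
        cols = [x for row in mask for x, value in enumerate(row) if value]
        return (min(cols), min(rows), max(cols), max(rows))


masks = {"Cup|demo": [[0, 0, 0, 0],
                      [0, 1, 1, 0],
                      [0, 1, 1, 0]]}
lazy = LazyBoxes(masks)
cached_before = dict(lazy._cache)  # nothing has been computed yet
cup_box = lazy["Cup|demo"]         # triggers the computation
```

This would also explain why printing the object's `__dict__` shows an empty `_detections2d` entry before any box has been requested.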

Here's what I've found:

You can get the list of objects that have a 2D bounding box in the current frame with:

list(event.instance_detections2D.instance_masks.keys())

this returns something like:

['Cabinet|+00.95|+02.16|-02.38',
 'StoveBurner|+01.08|+00.92|-01.50',
 'Cabinet|+00.95|+02.16|-00.76',
 'StandardWallSize|1|0|2',
 'Cabinet|+00.95|+02.44|-01.78',
 'StoveBurner|+00.84|+00.92|-01.10',
 'StoveBurner|+00.84|+00.92|-01.50',
 'StoveBurner|+01.08|+00.92|-01.10',
 'StandardCounterHeightWidth|0.98|0|0.18',
 'StandardUpperCabinetHeightWidth|1.28|0|0.18',
 'StandardWallTileHeight1|1.3|0|0.18',
 'StoveBase1|0.997|0|-1.302',
 'StoveTopGas|-1.503001|0|-1.06545',
 'Pan|+00.85|+00.95|-01.08',
 'SaltShaker|+01.19|+00.90|-01.80',
 'Microwave|+01.04|+01.68|-01.30',
 'Cup|+01.08|+00.90|-00.77',
 'StoveKnob|+00.67|+00.90|-01.24',
 'StoveKnob|+00.67|+00.90|-01.09',
 'StoveKnob|+00.67|+00.90|-01.52',
 'StoveKnob|+00.67|+00.90|-01.37',
 'CoffeeMachine|+00.89|+00.90|-02.13',
 'PepperShaker|+01.09|+00.90|-01.82',
 'Spatula|+01.10|+00.91|-00.63',
 'PaperTowelRoll|+01.22|+01.01|-00.52',
 'CounterTop|+00.93|+00.95|-02.05',
 'CounterTop|+00.93|+00.95|-00.21']

Then, indexing into instance_detections2D with any of these object IDs gives something like:

event.instance_detections2D["Cabinet|+00.95|+02.16|-02.38"]

which returns

(237, 0, 299, 141)

corresponding to the [Upper Left x, Upper Left y, Lower Right x, Lower Right y] bounds of the object in the image, which can be drawn over the frame.

Here is a link to a Colab notebook to reproduce: https://colab.research.google.com/drive/1Matvn6yqDdBld3MEv_hSFP67f6aN1dR7?usp=sharing
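As a rough sketch of how the two steps above fit together, the loop below collects one box per key in instance_masks. The `collect_boxes` helper and the `FakeDetections` stand-in are hypothetical and exist only so the snippet runs without a simulator; with a real event you would pass `event.instance_detections2D` instead.

```python
def collect_boxes(detections):
    """Return {object_id: (x1, y1, x2, y2)} for every key in instance_masks."""
    return {
        object_id: tuple(int(v) for v in detections[object_id])
        for object_id in detections.instance_masks.keys()
    }


class FakeDetections:
    """Stand-in mimicking only the two access patterns used above (hypothetical)."""

    def __init__(self, boxes):
        self._boxes = boxes
        self.instance_masks = dict.fromkeys(boxes)  # only the keys matter here

    def __getitem__(self, object_id):
        return self._boxes[object_id]


fake = FakeDetections({"Cabinet|+00.95|+02.16|-02.38": (237, 0, 299, 141)})
boxes = collect_boxes(fake)

# Unpack in the [upper-left x, upper-left y, lower-right x, lower-right y] order.
x1, y1, x2, y2 = boxes["Cabinet|+00.95|+02.16|-02.38"]
box_width, box_height = x2 - x1, y2 - y1
```

The resulting tuples can be handed to any drawing routine that takes two corner points.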

Hope that helps, let me know if you have any other questions :)

pioneer-innovation commented 2 years ago

Thank you! It works now!

pioneer-innovation commented 2 years ago

By the way, this does not work in ProcTHOR. The keys look wrong, but the boxes are correct. I followed your code exactly.

list(event.instance_detections2D.instance_masks.keys())

It returns:

['door|1|3', 'door|2|3', '3|5', '2|6', '3|2', '3|0|2', '2|1', 'Ceiling_room|2|0|2.61353|0', 'wall|3|15.78|5.26|15.78|8.77', 'wall|3|10.52|8.77|15.78|8.77', 'wall|2|15.78|1.75|15.78|5.26', 'wall|2|8.77|5.26|15.78|5.26', 'room|3', 'room|2']

Only the 'door', 'Ceiling_room', and 'wall' keys look correct. However, indexing with the seemingly wrong keys (such as '3|5' and '3|0|2') still returns the correct 2D boxes.

Here is my whole code:

from ai2thor.controller import Controller
import pickle
import cv2
# load data (the with-block closes the file once it has been read)
with open("/home/casia/dataset/procthor/houses.pkl", "rb") as f:
    dataset = pickle.load(f)
train_dataset = dataset["train"]
train_house_num = len(train_dataset)
# per house
for i in range(1,train_house_num):
    # init house
    train_house = train_dataset[i]
    controller = Controller(branch="nanna",
                            scene="Procedural",
                            renderInstanceSegmentation=True,
                            renderObjectImage=True)
    controller.step(action="CreateHouse", house=train_house)
    controller.step(action="TeleportFull", **train_house["metadata"]["agent"])
    # get all possible position
    event = controller.step(action="GetReachablePositions")
    positions = event.metadata["actionReturn"]
    rotations = [0, 45, 90, 135, 180, 225, 270, 315]
    horizons = [0, 30]
    # per position
    for position in positions:
        for rotation in rotations:
            for horizon in horizons:
                # get keys
                event = controller.step(action="Teleport", position=position, rotation=rotation, horizon=horizon)
                object_ids = list(event.instance_detections2D.instance_masks.keys())
                img = event.cv2img.copy()
                # draw boxes
                bboxes = {}
                for object_id in object_ids:  # renamed from "object" to avoid shadowing the builtin
                    class_name = object_id.split('|')[0]
                    bbox = event.instance_detections2D[object_id]
                    # keep every box of a class instead of overwriting the previous one
                    bboxes.setdefault(class_name, []).append(list(bbox))
                    img = cv2.rectangle(img, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (255, 0, 0), 2)
                # display
                cv2.namedWindow("AI2THOR", cv2.WINDOW_NORMAL)
                cv2.imshow("AI2THOR", img)
                key = cv2.waitKey(2000)
    # shut down this house's simulator before starting the next one
    controller.stop()

mattdeitke commented 2 years ago

The keys appear to be correct to me.

ProcTHOR objects don't follow the same objectId pattern; in general, the IDs just have to be unique. So the object that "2|6" corresponds to can be found with:

next(obj for obj in event.metadata["objects"] if obj["objectId"] == "2|6")

pioneer-innovation commented 2 years ago

OK, I see. I can find the object type in the metadata. Thank you! PS: Finding the object in the metadata seems a little cumbersome. event.metadata["objects"] is a list, so I need to traverse it with the following code to look up each object:

for obj in event.metadata["objects"]:
    if obj["objectId"] == objectID:
        class_name = obj["objectType"]
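One way to avoid repeating that scan for every key is to build the lookup once per event. This is only a sketch, assuming objectIds are unique within a frame; `index_object_types` is a hypothetical helper, and the sample list stands in for `event.metadata["objects"]` with made-up object types.

```python
def index_object_types(objects):
    """Map objectId -> objectType with a single pass over the metadata list."""
    return {obj["objectId"]: obj["objectType"] for obj in objects}


# Stand-in for event.metadata["objects"]; the type names are made up for illustration.
objects = [
    {"objectId": "2|6", "objectType": "Sofa"},
    {"objectId": "3|5", "objectType": "Box"},
]
type_by_id = index_object_types(objects)

# .get() supplies a fallback for any ID that is not listed in the metadata.
class_name = type_by_id.get("2|6", "unknown")
missing_name = type_by_id.get("room|2", "unknown")
```

After this, each lookup is a single dictionary access rather than a loop over the whole list.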

mattdeitke commented 2 years ago

That's a good point! I agree that it makes debugging harder.

Let me see if anybody would be against me prepending each object ID with the object type in the ProcTHOR-10K house JSONs. It shouldn't really change anything unless somebody hard-coded the object IDs for some reason, but that seems unlikely.

mattdeitke commented 2 years ago

Hi @pioneer-innovation,

I've taken your suggestion into account and updated all the objectIds in ProcTHOR-10K! :)

Each objectId is now prepended with its object type. Take a look: https://colab.research.google.com/drive/1aoBvg6KqBZgUT2buNOUmGQA9wjdx3F3F?usp=sharing

Note, we have also updated the distribution of ProcTHOR-10K to now use the prior package, which points to the procthor-10k repo. This makes it much easier to download, version, and use the dataset in projects, by simply installing:

pip install prior

and running:

import prior
dataset = prior.load_dataset("procthor-10k")

pioneer-innovation commented 2 years ago

That is fantastic! Thanks for your wonderful work!

mariiak2021 commented 9 months ago

Hi @pioneer-innovation and @mattdeitke,

I'm currently experiencing a problem with object IDs in instance_detections2D when using houses from a ProcTHOR-generated dataset. The keys in instance_detections2D do not include child objects, only general house structures such as ['window|2|0', 'door|1|2', 'Ceiling_room|2|0|2.504048|0', 'wall|2|6.86|0.00|6.86|3.43', 'wall|2|0.00|3.43|6.86|3.43', 'room|2']

Can you please suggest which version of AI2-THOR and which ProcTHOR commit I should use to solve this problem?

allenai / ai2thor

instance_detections2D #1036