askforalfred / alfred

ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
MIT License

What is the best way to get a GT mask? #134

Closed (TopCoder2K closed 1 year ago)

TopCoder2K commented 1 year ago

Do I understand correctly that env.last_event.class_masks contains GT masks for all objects present in self.env.last_event.frame? How should I pick the desired GT mask? Is the following correct?

class2mask = self.env.last_event.class_masks
mask = class2mask[target_obj]

where target_obj is the name of the target class (i.e. the bare class name without additional information such as coordinates, e.g. 'KeyChain')?

MohitShridhar commented 1 year ago

@TopCoder2K check out the augment_trajectories.py guide. GT masks are saved here.

TopCoder2K commented 1 year ago

@MohitShridhar, thank you for the links! But are you sure this is the best way to obtain masks when only a class name is available? (I'm trying to build a GT segmenter that returns a GT mask given a class name.)

I've seen the use of last_event.instance_segmentation_frame and last_event.color_to_object_id inside the ThorEnv.va_interact() function but it looks like a different scenario. Also, I've noticed that there are keys like 'CellPhone|+02.71|+00.80|-01.48' and 'CellPhone' inside last_event.object_id_to_color, so I'm not sure which one I should choose when I only have 'CellPhone' as a target class. Or can I choose any key containing 'CellPhone'?

TopCoder2K commented 1 year ago

Interesting... For this frame (193), the keys of env.last_event.class_masks are:

['Cube.175', 'Cabinet', 'OVENDOORBOTTOM.002', 'Drawer', 'Knife', 'CounterTop', 'Spatula', 'Sink', 'Cup', 'Apple', 'Bottle', 'Sphere.014', 'Jar', 'Bag', 'Faucet', 'Cube.003', 'Ladle']

So there is no 'AppleSliced' key, while env.last_event.color_to_object_id has four entries containing 'AppleSliced' in their names:

(4, 13, 230): 'Apple|-01.27|+00.85|+00.23|AppleSliced_2'
(93, 226, 181): 'AppleSliced'
(95, 55, 227): 'Apple|-01.27|+00.85|+00.23|AppleSliced_1'
(245, 235, 43): 'Apple|-01.27|+00.85|+00.23|AppleSliced_0'

Thus, the second question above remains: can I choose any object containing 'AppleSliced' and be sure that the interaction will be successful (provided the target object is close enough)?
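To make the two key shapes easier to tell apart, here is a minimal sketch that extracts the class name from an object ID and lists the instance-level candidates for a class. It assumes only the ID formats visible in this thread; `object_class` and `instance_ids_of_class` are hypothetical helper names, not part of ALFRED or AI2THOR, and the sketch does not settle which key an interaction will accept.

```python
def object_class(object_id: str) -> str:
    """Return the class name encoded in an object ID.

    Assumes the two key shapes shown in this thread:
      'CellPhone|+02.71|+00.80|-01.48'            -> 'CellPhone'
      'Apple|-01.27|+00.85|+00.23|AppleSliced_2'  -> 'AppleSliced'
    Keys without '|' (e.g. 'AppleSliced', 'Cube.175') are returned unchanged.
    """
    parts = object_id.split("|")
    if len(parts) > 4:
        # A fifth field names a derived object such as a slice; drop its index.
        return parts[4].rsplit("_", 1)[0]
    return parts[0]


def instance_ids_of_class(color_to_object_id, target_class):
    """List instance-level IDs (those carrying coordinates) of one class."""
    return sorted(
        oid
        for oid in color_to_object_id.values()
        if "|" in oid and object_class(oid) == target_class
    )


# Entries copied from the color_to_object_id dump above.
color_to_object_id = {
    (4, 13, 230): "Apple|-01.27|+00.85|+00.23|AppleSliced_2",
    (93, 226, 181): "AppleSliced",
    (95, 55, 227): "Apple|-01.27|+00.85|+00.23|AppleSliced_1",
    (245, 235, 43): "Apple|-01.27|+00.85|+00.23|AppleSliced_0",
}
candidates = instance_ids_of_class(color_to_object_id, "AppleSliced")
```

Here `candidates` holds the three coordinate-bearing IDs and skips the bare 'AppleSliced' key.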

UPD1 from 02/08/23: The problem with env.last_event.color_to_object_id is that it seems to contain all possible objects for the scene. So I first need to find all the colors present in the current frame, then choose the target color by name, and only after that get the GT mask. This looks quite complicated; can getting the target mask be simplified?

UPD2 from 02/08/23: Hmm, I've noticed env.last_event.instance_masks. It looks like this is what I need.

UPD3 from 02/08/23: Oh, sorry @MohitShridhar, I hadn't realized that there was valid documentation for this (although it's for the latest AI2THOR version, the behaviour seems to be the same in 2.1.0). So env.last_event.instance_masks is really what I need.
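For comparison, the manual route from UPD1 (find the colors present in the frame, pick the target colors by name, build the mask) can be sketched in plain Python. This is only an illustration on a toy frame: `mask_from_segmentation` and the `is_target` predicate are hypothetical names, the 'CounterTop' entry is made up, and on real frames a vectorized NumPy comparison would be the practical choice.

```python
def mask_from_segmentation(seg_frame, color_to_object_id, is_target):
    """Build a boolean mask for target objects visible in seg_frame.

    seg_frame: H x W grid of (r, g, b) pixels (nested lists or an array).
    color_to_object_id: maps (r, g, b) tuples to object IDs.
    is_target: hypothetical predicate deciding whether an object ID is wanted.
    Returns (mask, target_colors).
    """
    # Normalise pixels to plain int tuples so they can be used as dict keys.
    pixels = [[tuple(int(c) for c in px) for px in row] for row in seg_frame]

    # Step 1: colours that actually occur in this frame.
    present = {px for row in pixels for px in row}

    # Step 2: keep the visible colours whose object ID matches the target.
    target_colors = {
        color
        for color in present
        if color in color_to_object_id and is_target(color_to_object_id[color])
    }

    # Step 3: mark every pixel drawn in one of the target colours.
    mask = [[px in target_colors for px in row] for row in pixels]
    return mask, target_colors


color_to_object_id = {
    (4, 13, 230): "Apple|-01.27|+00.85|+00.23|AppleSliced_2",
    (95, 55, 227): "Apple|-01.27|+00.85|+00.23|AppleSliced_1",
    (10, 20, 30): "CounterTop|+00.00|+00.90|+00.00",  # made-up entry
}
# Toy 2 x 3 segmentation frame; only one apple slice is visible in it.
frame = [
    [(10, 20, 30), (4, 13, 230), (4, 13, 230)],
    [(10, 20, 30), (10, 20, 30), (0, 0, 0)],
]
mask, colors = mask_from_segmentation(
    frame, color_to_object_id, lambda oid: "AppleSliced" in oid
)
```

The slice that does not appear in the frame contributes no color, so the mask covers only the pixels of the visible one.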

TopCoder2K commented 1 year ago

I also do not understand how interaction works. Looking at ThorEnv.va_interact(), one might think that a correct mask is enough for a successful interaction, but I've encountered the following cases:

[images: 157, gt_mask_157]

Action is OpenObject, Success is False, Object failed to open/close successfully.

[images: 155, gt_mask_155]

Action is PutObject, Success is False, No valid positions to place object found.

I could explain the second case with something like 'Half of the object is not seen, why do you expect the interaction to be successful?', but why was the fridge not opened? I guarantee that it is visible (distance is <= 1.5m).

TopCoder2K commented 1 year ago

I'm equally puzzled by this case:

[images: 41, gt_mask_41]

Action is PutObject, Success is False, No valid positions to place object found.

although the object is visible. Why did the interaction fail?

thomason-jesse commented 1 year ago

The error message from AI2THOR details the reason: "No valid positions to place object found." This version of THOR samples potential locations for the held object against the receptacle and fails if it iterates through them and finds that they all cause collisions. It can be resolved by trying again from a different position (sometimes) or by clearing off whatever is already on the receptacle.

Similarly, the error "Object failed to open/close successfully." is the explanation for the fridge failing to open. This error is most common when the agent is standing so close that the fridge door would collide with it or with its held object on the path to opening.

In both cases, these are simulator quirks at the AI2THOR level, and the error messages you copied over are the explanations. These aren't ALFRED-specific problems.
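The "try again from a different position" advice can be sketched as a small retry loop. This is a minimal illustration, not ALFRED or AI2THOR code: `interact_with_retries`, `try_interaction`, and the `FakeAgent` stand-in are all hypothetical, and a real version would wrap whatever interaction and movement calls your environment exposes.

```python
def interact_with_retries(try_interaction, repositions):
    """Attempt an interaction, moving the agent between failed attempts.

    try_interaction: hypothetical callable returning (success, error_message).
    repositions: hypothetical callables, each moving the agent somewhere new.
    Returns (success, list_of_error_messages_seen).
    """
    errors = []
    success, err = try_interaction()
    if success:
        return True, errors
    errors.append(err)
    for move in repositions:
        move()  # change the agent's pose before trying again
        success, err = try_interaction()
        if success:
            return True, errors
        errors.append(err)
    return False, errors


class FakeAgent:
    """Stand-in for a simulator wrapper; placing succeeds after two moves."""

    def __init__(self):
        self.moves = 0

    def step_back(self):
        self.moves += 1

    def put_object(self):
        if self.moves < 2:
            return False, "No valid positions to place object found."
        return True, ""


agent = FakeAgent()
ok, errors = interact_with_retries(agent.put_object, [agent.step_back] * 3)
```

Collecting the error messages rather than discarding them keeps the simulator's explanation available when every attempt fails.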

TopCoder2K commented 1 year ago

@thomason-jesse, thank you for the explanations! They are really valuable because I've seen the error messages, but the reason behind them remained unclear. In addition, it's quite surprising to me that there are cases similar to what happened with the "ArmChair". I thought AI2THOR didn't have any quirks...