askforalfred / alfred

ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
MIT License

Why do you use ground-truth instance segmentation mask in va_interact? #70

Closed (soyeonm closed 3 years ago)

soyeonm commented 3 years ago

Hello,

I have a question. In line 501 of va_interact (in thor_env.py), you use the ground-truth instance segmentation of the current view. Is this to obtain the exact 'object_id' (e.g. 'Fridge|-00.32|00.00|+03.60') of the object to interact with, since the predicted mask is only binary and the category alone (e.g. 'Fridge') is insufficient?

Thanks!

MohitShridhar commented 3 years ago

@soyeonm this function implements IoU-based object selection: [image]

> Is this to obtain the exact 'object_id' (e.g. 'Fridge|-00.32|00.00|+03.60') of the object to interact with?

Yes, you need the exact object_id because there could be multiple instances of the same object category in the same frame, e.g. apple1, apple2, and apple3. So you have to specify which apple to pick up via the pixelwise mask.
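
As a rough illustration of the idea, here is a minimal sketch of IoU-based object selection. It is not the code from thor_env.py: the function name `select_object_by_iou`, the list-of-lists inputs, and the apple and countertop IDs are all hypothetical, and a real implementation would most likely work on image arrays from the simulator instead. It assumes each pixel of the ground-truth frame has already been mapped to an object_id string.

```python
from collections import Counter


def select_object_by_iou(pred_mask, instance_ids):
    """Pick the ground-truth instance that best overlaps a binary mask.

    pred_mask: 2D list of 0/1 values (the predicted interaction mask).
    instance_ids: 2D list of the same shape; each entry is the object_id
        of the instance visible at that pixel (hypothetical stand-in for
        a ground-truth instance segmentation frame).
    Returns (object_id, iou), or (None, 0.0) if the mask is empty.
    """
    intersection = Counter()  # mask pixels that fall on each instance
    area = Counter()          # total pixels covered by each instance
    mask_area = 0             # total pixels set in the predicted mask

    for mask_row, id_row in zip(pred_mask, instance_ids):
        for m, obj_id in zip(mask_row, id_row):
            area[obj_id] += 1
            if m:
                mask_area += 1
                intersection[obj_id] += 1

    best_id, best_iou = None, 0.0
    for obj_id, inter in intersection.items():
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = mask_area + area[obj_id] - inter
        iou = inter / union
        if iou > best_iou:
            best_id, best_iou = obj_id, iou
    return best_id, best_iou


# Made-up IDs in the 'Category|x|y|z' style; two apples share one frame.
A1 = "Apple|+00.10|+00.90|+01.20"
A2 = "Apple|+00.50|+00.90|+01.20"
BG = "CounterTop|+00.00|+00.80|+01.00"

instance_ids = [
    [A1, A1, BG, A2, A2],
    [A1, A1, BG, A2, A2],
    [BG, BG, BG, BG, BG],
]
# Predicted mask covers the second apple plus one stray countertop pixel.
pred_mask = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

best_id, best_iou = select_object_by_iou(pred_mask, instance_ids)
print(best_id, best_iou)
```

In this toy frame a category label of 'Apple' alone could not distinguish the two apples, whereas the overlap with the mask singles out the second one.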

MohitShridhar commented 3 years ago

Closing due to inactivity.