Open pioneer-innovation opened 2 years ago
Hi @mattdeitke , Could you please help me? T-T
Hey @pioneer-innovation,
good question! It looks like the API updated and I wasn't aware of this. Looking into it now :)
Thank you! I am looking forward to using ProcTHOR once this issue is resolved. It is a fantastic embodied AI platform!
I'm going to update the documentation with this.
Basically there was a change to how the instance masks are rendered, such that the bounding box is only generated when it's requested, and not every time. This is done to speed up the FPS when using instance segmentation, since most users don't often utilize all the bounding boxes when instance segmentation is on.
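To illustrate the idea of computing a box only when it is requested, here is a minimal sketch. It is not AI2-THOR's actual implementation: the `LazyDetections` class, its cache, the nested-list mask format, and the inclusive lower-right corner are all assumptions made for the example.

```python
class LazyDetections:
    """Toy stand-in (hypothetical) for a detections object with on-demand boxes."""

    def __init__(self, instance_masks):
        # instance_masks: {object_id: 2D list of booleans, indexed [row][col]}
        self.instance_masks = instance_masks
        self._cache = {}

    def __getitem__(self, object_id):
        # The box is derived from the mask on first lookup, then cached.
        if object_id not in self._cache:
            self._cache[object_id] = self._bbox(self.instance_masks[object_id])
        return self._cache[object_id]

    @staticmethod
    def _bbox(mask):
        # Assumes the mask has at least one True pixel.
        rows = [y for y, row in enumerate(mask) if any(row)]
        cols = [x for row in mask for x, value in enumerate(row) if value]
        # (upper-left x, upper-left y, lower-right x, lower-right y), inclusive
        return (min(cols), min(rows), max(cols), max(rows))


mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, False, False],
]
detections = LazyDetections({"Cup|demo": mask})
cup_box = detections["Cup|demo"]  # computed here, not at construction time
```

Under this design, listing `instance_masks.keys()` costs nothing extra, and only the objects that are actually indexed pay for a box computation.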
Here's what I've found:
You can get the list of objects that have a 2D bounding box in the current frame with:
list(event.instance_detections2D.instance_masks.keys())
this returns something like:
['Cabinet|+00.95|+02.16|-02.38',
'StoveBurner|+01.08|+00.92|-01.50',
'Cabinet|+00.95|+02.16|-00.76',
'StandardWallSize|1|0|2',
'Cabinet|+00.95|+02.44|-01.78',
'StoveBurner|+00.84|+00.92|-01.10',
'StoveBurner|+00.84|+00.92|-01.50',
'StoveBurner|+01.08|+00.92|-01.10',
'StandardCounterHeightWidth|0.98|0|0.18',
'StandardUpperCabinetHeightWidth|1.28|0|0.18',
'StandardWallTileHeight1|1.3|0|0.18',
'StoveBase1|0.997|0|-1.302',
'StoveTopGas|-1.503001|0|-1.06545',
'Pan|+00.85|+00.95|-01.08',
'SaltShaker|+01.19|+00.90|-01.80',
'Microwave|+01.04|+01.68|-01.30',
'Cup|+01.08|+00.90|-00.77',
'StoveKnob|+00.67|+00.90|-01.24',
'StoveKnob|+00.67|+00.90|-01.09',
'StoveKnob|+00.67|+00.90|-01.52',
'StoveKnob|+00.67|+00.90|-01.37',
'CoffeeMachine|+00.89|+00.90|-02.13',
'PepperShaker|+01.09|+00.90|-01.82',
'Spatula|+01.10|+00.91|-00.63',
'PaperTowelRoll|+01.22|+01.01|-00.52',
'CounterTop|+00.93|+00.95|-02.05',
'CounterTop|+00.93|+00.95|-00.21']
then indexing into the instance_detections2D with any of these object IDs, we get something like:
event.instance_detections2D["Cabinet|+00.95|+02.16|-02.38"]
which returns
(237, 0, 299, 141)
corresponding to the [Upper Left x, Upper Left y, Lower Right x, Lower Right y] bound of the image, which can be plotted as follows:
Here is a link to a Colab notebook to reproduce: https://colab.research.google.com/drive/1Matvn6yqDdBld3MEv_hSFP67f6aN1dR7?usp=sharing
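For anyone post-processing these tuples, a small sketch of unpacking the (upper-left x, upper-left y, lower-right x, lower-right y) layout described above. The helper name `describe_box` is hypothetical, and the width and height are plain coordinate differences, which assumes nothing about whether the lower-right corner is inclusive.

```python
def describe_box(bbox):
    """Return width, height, and center for an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = bbox
    return {
        "width": x2 - x1,
        "height": y2 - y1,
        "center": ((x1 + x2) / 2, (y1 + y2) / 2),
    }


# The Cabinet box from the example output above.
info = describe_box((237, 0, 299, 141))
```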
Hope that helps, let me know if you have any other questions :)
Thank you! It works now!
By the way, it does not work in ProcTHOR: the keys are wrong, but the boxes are correct. I followed your code exactly.
list(event.instance_detections2D.instance_masks.keys())
It returns:
['door|1|3', 'door|2|3', '3|5', '2|6', '3|2', '3|0|2', '2|1', 'Ceiling_room|2|0|2.61353|0', 'wall|3|15.78|5.26|15.78|8.77', 'wall|3|10.52|8.77|15.78|8.77', 'wall|2|15.78|1.75|15.78|5.26', 'wall|2|8.77|5.26|15.78|5.26', 'room|3', 'room|2']
Only 'door', 'Ceiling_room', and 'wall' are output correctly. However, indexing with the wrong-looking keys (such as '3|5' and '3|0|2') still returns the correct 2D boxes.
Here is my whole code:
from ai2thor.controller import Controller
import pickle
import cv2
# load data
with open("/home/casia/dataset/procthor/houses.pkl", "rb") as f:
    dataset = pickle.load(f)
train_dataset = dataset["train"]
train_house_num = len(train_dataset)
# per house (note: the range starts at 1, so house 0 is skipped)
for i in range(1, train_house_num):
    # init house
    train_house = train_dataset[i]
    controller = Controller(branch="nanna",
                            scene="Procedural",
                            renderInstanceSegmentation=True,
                            renderObjectImage=True)
    controller.step(action="CreateHouse", house=train_house)
    controller.step(action="TeleportFull", **train_house["metadata"]["agent"])
    # get all possible position
    event = controller.step(action="GetReachablePositions")
    positions = event.metadata["actionReturn"]
    rotations = [0, 45, 90, 135, 180, 225, 270, 315]
    horizons = [0, 30]
    # per position
    for position in positions:
        for rotation in rotations:
            for horizon in horizons:
                # get keys
                event = controller.step(action="Teleport", position=position, rotation=rotation, horizon=horizon)
                object_ids = list(event.instance_detections2D.instance_masks.keys())
                img = event.cv2img.copy()
                # draw boxes
                bboxes = {}
                for object_id in object_ids:
                    # class name is taken from the objectId prefix
                    class_name = object_id.split('|')[0]
                    bbox = event.instance_detections2D[object_id]
                    bboxes[class_name] = list(bbox)
                    img = cv2.rectangle(img, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (255, 0, 0), 2)
                # display
                cv2.namedWindow("AI2THOR", cv2.WINDOW_NORMAL)
                cv2.imshow("AI2THOR", img)
                key = cv2.waitKey(2000)
    # shut down this house's controller before starting the next one
    controller.stop()
The keys appear to be correct to me.
ProcTHOR objects don't follow the same objectId pattern; in general, the IDs just have to be unique. So the object that "2|6" corresponds to can be found with:
next(obj for obj in event.metadata["objects"] if obj["objectId"] == "2|6")
OK, I see. I can find object type in metadata. Thank you!
PS: Finding the object in the metadata seems a little cumbersome. event.metadata["objects"] is a list, so I need to traverse it with the following code to look up an object:
for obj in event.metadata["objects"]:
    if obj["objectId"] == objectID:
        class_name = obj["objectType"]
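One way to avoid scanning the list for every key is to build an objectId-to-objectType dictionary once per event. This is a sketch: the helper names and the sample object types below are hypothetical, and the fallback to the ID prefix for keys missing from the list (for example walls and doors) is an assumption about what a caller would want, not documented AI2-THOR behavior.

```python
def build_type_index(objects):
    """Map each objectId to its objectType with a single pass over the list."""
    return {obj["objectId"]: obj["objectType"] for obj in objects}


def class_name_for(object_id, type_index):
    # Fall back to the text before the first "|" when the ID is not in the index.
    return type_index.get(object_id, object_id.split("|")[0])


# Stand-in for event.metadata["objects"]; the object types are made up.
objects = [
    {"objectId": "2|6", "objectType": "Bed"},
    {"objectId": "3|5", "objectType": "Sofa"},
]
type_index = build_type_index(objects)
bed_name = class_name_for("2|6", type_index)
wall_name = class_name_for("wall|3|15.78|5.26|15.78|8.77", type_index)
```

With a real event, the index would be rebuilt from `event.metadata["objects"]` whenever the scene changes.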
That's a good point! I agree that it makes debugging harder.
Let me see if anybody would be against me prepending each object ID with the object type in the ProcTHOR-10K house JSONs. It shouldn't really change anything unless somebody hard-coded the object IDs for some reason, but that seems unlikely.
Hi @pioneer-innovation,
I've taken your suggestion into account and updated all the objectIds in ProcTHOR-10K! :)
Each objectId is now prepended with its object type. Take a look: https://colab.research.google.com/drive/1aoBvg6KqBZgUT2buNOUmGQA9wjdx3F3F?usp=sharing
Note, we have also updated the distribution of ProcTHOR-10K to now use the prior package, which points to the procthor-10k repo. This makes it much easier to download, version, and use the dataset in projects, by simply installing:
pip install prior
and running:
import prior
dataset = prior.load_dataset("procthor-10k")
That is fantastic ! Thanks for your wonderful work !
Hi @pioneer-innovation and @mattdeitke,
I'm currently experiencing a problem with the object IDs in instance_detections2D when using houses from a ProcTHOR-generated dataset. The keys in instance_detections2D do not include child objects, only general house structures such as ['window|2|0', 'door|1|2', 'Ceiling_room|2|0|2.504048|0', 'wall|2|6.86|0.00|6.86|3.43', 'wall|2|0.00|3.43|6.86|3.43', 'room|2']
Can you please suggest which version of AI2-THOR and which ProcTHOR commit I should use to solve this problem?
Hi AI2THOR team, I want to obtain the 2D detection bounding boxes of visible objects from event.instance_detections2D. I followed the ai2thor tutorial to set up a demo, but I found that there is only an attribute named instance_masks in event.instance_detections2D and there is no information about detection bounding boxes. Here is my code:
I have tried many steps with different actions, but I still can not find any information about 2D detection boxes in event.instance_detections2D. Here is the print information: