Closed chrockey closed 7 months ago
Doubling down on this question. Particularly on the part about Room Segmentation in HM3D.
@Buzz-Beater could you provide code or another way to find out which part of a scene corresponds to each of your segmentations? We're planning to use the HM3D annotations from your dataset, but with the habitat simulator directly rather than the provided point clouds. Without knowing how the room segmentations are computed, the data is not usable for us.
@chrockey I can help with the second question. I can recommend the following two options:
Thank you
@SergioArnaud Thanks for the help!
I thought there were ground truth RGBD videos captured by humans, since HM3D is real, not synthetic. Are the RGBD videos from Omnidata the original (ground truth) RGBD videos of HM3D, or just videos rendered using a simulator?
Nope, they are not videos but 3D scans from real world environments. Omnidata will not return a video but a set of observations from this 3D scan
> Omnidata will not return a video but a set of observations from this 3D scan
Oh, I see. In that case, the set of observations from Omnidata does not necessarily cover the full 3D scene. It could be partial. Is my understanding correct?
That is correct. You can always oversample and get a ton of frames to get close to full coverage. Or do some fancy tricks on top of omnidata.
If you want full coverage, using an agent to do frontier exploration on habitat might be an easier path (if you're familiar with habitat and frontier exploration)
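For anyone unfamiliar with the term, here is a minimal sketch of the frontier detection step only, on a toy 2D occupancy grid. The cell encoding (`UNKNOWN`, `FREE`, `OCCUPIED`) and the function name `find_frontiers` are made up for illustration and are not part of habitat; a real exploration loop would also need mapping from depth observations and path planning to the chosen frontier.

```python
# Hypothetical cell encoding for a toy occupancy grid (not a habitat API).
UNKNOWN, FREE, OCCUPIED = -1, 0, 1


def find_frontiers(grid):
    """Return (row, col) of free cells that touch at least one unknown cell."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            # Check the 4-connected neighbours that lie inside the grid.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
print(find_frontiers(grid))  # [(0, 1), (2, 2)]
```

The agent would repeatedly move toward one of the returned cells, update the grid from new observations, and stop when no frontier cells remain.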
Also interested in knowing how the sceneverse authors did it. @Buzz-Beater
Hi,
The room segmentation from HM3D can be found as follows. scan_id is formatted as {scene_id}_{room_id}, e.g., {00006-HkseAnWCgqk}_{sub002}. room_id can be found by:
import trimesh

# Decompose the scene mesh into sub-rooms.
# glb_dir is the path to the scene's .glb file.
scene = trimesh.load(glb_dir, file_type='glb')
room_dict = dict()
for name, _g in scene.geometry.items():
    group_name = name.split('_')[2]  # group_name is the room_id
    if group_name not in room_dict:
        room_dict[group_name] = []
    room_dict[group_name].append(_g)
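To make the naming scheme above concrete, here is a small sketch that runs without trimesh. It assumes the braces in the example are only template notation, so that the actual scan_id reads `00006-HkseAnWCgqk_sub002`, and that room_id itself contains no underscore. The geometry names in `names` are hypothetical placeholders; only the rule "third underscore-separated field is the room_id" comes from the snippet above.

```python
from collections import defaultdict


def make_scan_id(scene_id, room_id):
    """Join scene_id and room_id as {scene_id}_{room_id}."""
    return f"{scene_id}_{room_id}"


def split_scan_id(scan_id):
    """Split on the last underscore (assumes room_id has no underscore)."""
    scene_id, room_id = scan_id.rsplit("_", 1)
    return scene_id, room_id


def group_names_by_room(names):
    """Group geometry names by their third underscore-separated field."""
    rooms = defaultdict(list)
    for name in names:
        rooms[name.split("_")[2]].append(name)
    return dict(rooms)


# Hypothetical geometry names, for illustration only.
names = ["obj_0_sub001_wall", "obj_1_sub002_chair", "obj_2_sub001_floor"]
rooms = group_names_by_room(names)
scan_ids = sorted(make_scan_id("00006-HkseAnWCgqk", r) for r in rooms)
print(scan_ids)
print(split_scan_id("00006-HkseAnWCgqk_sub002"))
```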
Regarding the images/object captions in HM3D, we only released the template-based object captions in the current version. One easy way to extract images is to use the habitat simulator, as @SergioArnaud also mentions.
@SergioArnaud and @yixchen Thanks for the reply! I appreciate your help with this :)
@yixchen
Thank you so much for the answer! Do you have any pointers on how to use this constrained mesh information in habitat?
I'm not sure how to find room_id in habitat, but one workaround you can try is to extract the (rough) layout/floor map information from the mesh file and use it to locate the objects in the simulator.
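A rough sketch of that workaround, under some assumptions: each room is reduced to an axis-aligned bounding box computed from its mesh vertices, and a 3D position is then matched against those boxes. The helper names `room_bounds` and `locate_room` and the toy vertices are hypothetical. With trimesh, the vertices could come from each geometry's `vertices` attribute. This also assumes the queried position is expressed in the same coordinate frame as the mesh vertices, which would need to be checked against the simulator.

```python
def room_bounds(vertices):
    """Axis-aligned bounding box (min_xyz, max_xyz) of (x, y, z) vertices."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def locate_room(point, bounds_by_room):
    """Return the ids of all rooms whose bounding box contains the point."""
    hits = []
    for room_id, (lo, hi) in bounds_by_room.items():
        if all(l <= p <= h for p, l, h in zip(point, lo, hi)):
            hits.append(room_id)
    return hits


# Toy vertices standing in for the merged vertices of each room's meshes.
room_vertices = {
    "sub001": [(0.0, 0.0, 0.0), (4.0, 2.5, 3.0)],
    "sub002": [(4.0, 0.0, 0.0), (8.0, 2.5, 3.0)],
}
bounds = {rid: room_bounds(v) for rid, v in room_vertices.items()}
print(locate_room((1.0, 1.0, 1.0), bounds))  # ['sub001']
```

Bounding boxes of adjacent or non-convex rooms can overlap, so a point may match more than one room; that is why the lookup returns a list rather than a single id.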
What I understand from the code snippet is that you have ground truth annotations for HM3D, which means you're using the HM3D semantics dataset, not plain HM3D. If that's the case, you should also cite HM3D-semantics in the paper; that way it is much easier to understand how you got the room annotations.
Thank you for the help @yixchen
Yes, we use annotations from HM3D semantics. We will further clarify and add the citation in the revised version. Thanks.
Hi,
I have two questions regarding your use of HM3D for SceneVerse:
Thank you!