facebookresearch / habitat-lab

A modular high-level library to train embodied AI agents across a variety of tasks and environments.
https://aihabitat.org/
MIT License

Question about the mismatch between semantic images and RGB images #899

ghost opened this issue 2 years ago (status: Open)

ghost commented 2 years ago

Habitat-Lab and Habitat-Sim versions

Habitat-Lab: 0.2.2
Habitat-Sim: 0.2.2

Docs and Tutorials

Did you read the docs? YES
Did you check out the tutorials? YES

❓ Questions and Help

Mismatch between semantic images and RGB images. I use the ObjectNav MP3D config to create a simple demo; the config is listed as follows:

ENVIRONMENT:
  MAX_EPISODE_STEPS: 500
SIMULATOR:
  TURN_ANGLE: 30
  TILT_ANGLE: 30
  ACTION_SPACE_CONFIG: "v1"
  AGENT_0:
    SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR', 'SEMANTIC_SENSOR']
    HEIGHT: 0.88
    RADIUS: 0.18
  HABITAT_SIM_V0:
    GPU_DEVICE_ID: 0
    ALLOW_SLIDING: False
  SEMANTIC_SENSOR:
    WIDTH: 256
    HEIGHT: 256
    HFOV: 79
    POSITION: [0, 0.88, 0]
  RGB_SENSOR:
    WIDTH: 256
    HEIGHT: 256
    HFOV: 79
    POSITION: [0, 0.88, 0]
  DEPTH_SENSOR:
    WIDTH: 256
    HEIGHT: 256
    HFOV: 79
    MIN_DEPTH: 0.5
    MAX_DEPTH: 5.0
    POSITION: [0, 0.88, 0]
TASK:
  TYPE: ObjectNav-v1
  POSSIBLE_ACTIONS: ["STOP", "MOVE_FORWARD", "TURN_LEFT", "TURN_RIGHT", "LOOK_UP", "LOOK_DOWN"]
  SENSORS: ['OBJECTGOAL_SENSOR', 'COMPASS_SENSOR', 'GPS_SENSOR']
  GOAL_SENSOR_UUID: objectgoal
  MEASUREMENTS: ['DISTANCE_TO_GOAL', 'SUCCESS', 'SPL', 'SOFT_SPL']
  DISTANCE_TO_GOAL:
    DISTANCE_TO: VIEW_POINTS
  SUCCESS:
    SUCCESS_DISTANCE: 0.1

DATASET:
  TYPE: ObjectNav-v1
  SPLIT: val
  DATA_PATH: "/home/henry/Desktop/dataset/rl_envs/habitat-lab/data/datasets/objectnav/mp3d/v1/{split}/{split}.json.gz"
  SCENES_DIR: "/home/henry/Desktop/dataset/rl_envs/habitat-lab/data/scene_datasets/"
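
All three sensors above share the same POSITION ([0, 0.88, 0]) and HFOV (79), so the RGB and semantic frames should be rendered from an identical viewpoint. As a quick sanity check, here is a minimal sketch (assuming the same config file that the demo below loads) that prints the sensor parameters as habitat resolves them:

import habitat

# A minimal sketch: load the config and print each sensor's resolved parameters.
# POSITION and HFOV should be identical across RGB, SEMANTIC, and DEPTH if the
# frames are meant to line up pixel-for-pixel.
config = habitat.get_config("./config/imagenav_mp3d.yaml")
for name in ("RGB_SENSOR", "SEMANTIC_SENSOR", "DEPTH_SENSOR"):
    sensor_cfg = getattr(config.SIMULATOR, name)
    print(name, sensor_cfg.POSITION, sensor_cfg.HFOV, sensor_cfg.WIDTH, sensor_cfg.HEIGHT)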

And the Python code is listed as follows:

import habitat
import cv2
import numpy as np
from habitat_sim.utils.common import d3_40_colors_rgb
import os

# Silence simulator logging.
os.environ["MAGNUM_LOG"] = "quiet"
os.environ["HABITAT_SIM_LOG"] = "quiet"


def make_rgb(rgb_obs):
    # Pass-through: the RGB observation is already an (H, W, 3) uint8 array.
    return rgb_obs


def make_semantic(semantic_obs):
    # Map instance ids onto habitat-sim's 40-color debug palette.
    semantic_image = d3_40_colors_rgb[semantic_obs.flatten() % 40]
    return semantic_image.reshape(semantic_obs.shape[0], semantic_obs.shape[1], 3)


def make_depth(depth_obs):
    # Depth is normalized to [0, 1] by default, so scale to [0, 255] for a heatmap.
    return cv2.applyColorMap((depth_obs * 255.0).astype(np.uint8), cv2.COLORMAP_HOT)


config = habitat.get_config("./config/imagenav_mp3d.yaml")  # the ObjectNav config shown above
env = habitat.Env(config=config)
obs = env.reset()
while not env.episode_over:
    while True:
        key = cv2.waitKey(1)
        if key == ord("a"):  # TURN_LEFT
            action = 2
            break
        elif key == ord("d"):  # TURN_RIGHT
            action = 3
            break
        elif key == ord("w"):  # MOVE_FORWARD
            action = 1
            break
        # OpenCV expects BGR while habitat returns RGB, so convert before display.
        cv2.imshow("image", cv2.cvtColor(obs["rgb"], cv2.COLOR_RGB2BGR))
        cv2.imshow("semantic", cv2.cvtColor(make_semantic(obs["semantic"]), cv2.COLOR_RGB2BGR))
        cv2.imshow("depth", make_depth(obs["depth"]))
    obs = env.step(action)
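
One thing worth ruling out is a per-sensor pose problem at runtime. Below is a minimal sketch (run after env.reset(); env.sim.get_agent_state() is habitat-lab's simulator wrapper API) that prints where each sensor actually sits:

# A minimal sketch: after env.reset(), print the world pose of every sensor.
# With the config above, rgb, depth and semantic should all report the same
# position/rotation, since they share POSITION [0, 0.88, 0].
state = env.sim.get_agent_state()
print("agent:", state.position, state.rotation)
for uuid, sensor_state in state.sensor_states.items():
    print(uuid, sensor_state.position, sensor_state.rotation)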

But the output semantic images seem to be captured from a top-down view. Maybe I got the wrong configuration? [screenshot attached]
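
For reference, one way to see the offset directly is to alpha-blend the colorized semantic frame over the RGB frame; a hypothetical helper (reusing make_semantic from the code above), not part of the original demo:

def overlay_semantic(rgb_obs, semantic_obs, alpha=0.5):
    # Alpha-blend the colorized semantic frame over the RGB frame; if the two
    # sensors were aligned, object boundaries would coincide in the overlay.
    semantic_rgb = make_semantic(semantic_obs)
    return cv2.addWeighted(rgb_obs, 1.0 - alpha, semantic_rgb, alpha, 0.0)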

aclegg3 commented 2 years ago

https://github.com/facebookresearch/habitat-sim/issues/1808