minosworld / minos

MINOS: Multimodal Indoor Simulator

Getting sensor output from 'observations' #22

Open kvas7andy opened 6 years ago

kvas7andy commented 6 years ago

Hi! I have some questions I would like to ask about sensor data and its processing. I can split this into several separate issues if that is more convenient. I would be very grateful for your detailed answers, as my own attempts were unsuccessful. All questions relate to the API of RoomSimulator.py (which is also used in the UNREAL implementation that I use for my research).

  1. The step() function returns the whole state of the simulator. Which parameters influence the sensor output from the simulator, and how? The observations parameter from sim_config.py obviously does, but there is one more, called outputs, which includes 'color' in its list of values. Does it influence the sensor outputs? Does it influence the reward or goal?

  2. After plenty of time searching in pygame_client.py, I could not work out how the rooms & objects are recolored when they are received from observations as one of the included 'sensors'. As I understand it, only the red channel actually indicates the label of the object (or room?). I would like to colorize these labels, but the only script implementing this is pygame_client.py. In addition, what should roomtypes_file or objecttypes_file look like (these are undocumented parameters in the parser of _simargs.py)? In other words, what is the full list of available labels for objects and rooms? I did find a metadata file with the original Matterport3D coloring, but I cannot find the equivalent for SUNCG (in their repository).

  3. What does data_viz stand for, and can this data be generated automatically for depth and semantic segmentation? Even for color data, with the standard configuration files, I get None when checking the value in the sensors dictionary.

  4. I have downloaded suncg.v2.zip and mp3d.zip via the download links in the corresponding emails. As far as I can see, the mp3d.zip file (the archive for the MINOS task) does not include region_segmentation, i.e. the files with object semantic segmentation. It would be great if you could include object segmentation data in the mp3d archive for the MINOS task! The related question is whether objectType for mp3d is already implemented alongside the SUNCG segmentation (since I can get this information from SUNCG in observations).

angelxuanchang commented 6 years ago

@kvas7andy Thank you for trying out MINOS! Here are some answers to your questions (sorry for the late response).

  1. The step() function returns observations captured by the agent at the end of the step. Which observations are returned is controlled by the observations field (e.g. setting color to True will make sure that the color frame is returned as part of the observations dictionary); see the sketch after this list. This allows for easy enabling/disabling of observations with the same sensor configuration. The outputs field is used internally by the learning agents (unreal, gym example, dfp) to keep track of which observations/reward/goal they want to process. It is also used as an input parameter to RoomSimulator.get_observation_space to obtain the shapes of the observations, and internally by the agents to keep a history of observations and to hook up the agent networks based on which modalities are available.

  2. The rooms and objects are encoded using a 1-hot encoding. See https://github.com/minosworld/minos/issues/14 for the details of this encoding. The coloring is just for visualization purposes.

  3. The data_viz field is a conversion of the frame in the data field only for visualization purposes. It is used for the semantic encoding of rooms and objects (since it is hard to detect differences when everything is encoded in a small bit range -- see 1-hot encoding above). It is not used for color or depth. It is provided for semantic frames when visualize_sensors is set to True. See https://github.com/minosworld/minos/blob/master/minos/tools/pygame_client.py for an example.

  4. The object segmentation data is not available for the mp3d dataset yet. We are currently working on making this data available for MINOS.
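
To make items 1 and 3 concrete, here is a minimal sketch of a configuration and of how the returned frames are accessed. The key names in the config dict are an assumption based on this thread (check sim_config.py for the full list of supported keys); the access paths into the returned state follow the snippets posted later in this issue.

# Minimal sketch, assuming a RoomSimulator-style config as discussed in 1.
# Key names ('observations', 'color', 'depth', 'objectType', 'visualize_sensors')
# follow this thread; check sim_config.py for the exact set in your MINOS version.
sim_config = {
    'observations': {
        'color': True,        # RGB frame
        'depth': True,        # depth frame
        'objectType': True,   # semantic (object type) frame, 1-hot encoded (see #14)
    },
    'visualize_sensors': True,  # also provide 'data_viz' for semantic frames (see 3.)
}

# After a step, each enabled frame appears in the returned state, e.g.:
#   state = sim.step(action)
#   color = state['observation']['sensors']['color']['data']
#   sem   = state['observation']['sensors']['objectType']['data']
#   viz   = state['observation']['sensors']['objectType'].get('data_viz')
# Which channel of sem carries the label index depends on the encoding (see 2. and #14).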

msavva commented 6 years ago

@kvas7andy Are you able to use the sensor data as you would expect? If yes, shall we close this issue? I will summarize some of the details of this discussion in a FAQ document so that others can also refer to it.

kvas7andy commented 6 years ago

@msavva it would be great to have until Friday to run some tests!

kvas7andy commented 6 years ago

@msavva @angelxuanchang Thank you for your explanations! All of the information you provided was useful. Regarding the 4th point, it is still vital for me to use semantic segmentation data from the mp3d dataset! Could you please provide any code to get such data? I could not obtain it via the demo.py code, which works well for the SUNCG dataset (but not for mp3d).

kvas7andy commented 6 years ago

@msavva @angelxuanchang Could you please advise any workaround for getting object semantic segmentation data from the Matterport3D dataset? It is still a vital question. Thank you in advance! P.S. There are several new issues about tiny bugs and enhancements; your advice there would be very helpful!

msavva commented 6 years ago

@kvas7andy We are currently working on an update that adds Matterport3D semantic segmentation data. It should be coming within a few days.

kvas7andy commented 6 years ago

@msavva I am very glad to hear that! I truly appreciate your work!

kvas7andy commented 6 years ago

Hi,

I have to continue this thread, because some confusion came up while I was generating objectType masks. I have 2 questions:

[1]. I wrote new code for saving the objectType data, since plt.imsave(img) saves blue-looking images with the labels encoded in the B channel. To save these masks properly as grayscale images, I used this code:

# Snippet from a per-frame save function: `observation` and `prename` come from
# the surrounding context. Requires: import sys; import numpy as np; from PIL import Image
object_type = observation["observation"]["sensors"]["objectType"]["data"]
# Skip frames whose label channel contains fewer than 3 distinct classes
if len(np.unique(object_type[:, :, 2].astype(int))) < 3:
    print(prename + ' rejected!')
    sys.stdout.flush()
    return
# Swap the width/height axes (np.transpose keeps pixels intact, unlike reshape)
object_type = np.transpose(object_type, (1, 0, 2))
img = Image.fromarray(object_type[:, :, 2].astype(np.uint8), mode='L')
img.save(prename + 'object_type_labels.png')

At the same time I saved object_type[:, :, 2] as text, to validate my results. I opened my image with np.array(Image.open("0_1_object_type_labels.png").convert("L").getdata()), which gives me the right result.

But I can't match the obtained labels against the ones from the sstk-metadata repository; the links in issue #14 are now deprecated. So how do I find the real label names for the labels obtained from observations? Are they always correct? E.g. going by the sstk-metadata mapping, I would conclude that there is an ottoman {17} and a stand {21} (as well as a wall {1}) in this picture (attached here):

[Attached: colored picture; picture from "data_viz"; labels from "data" saved as a grayscale image; labels in txt format]

"Sink" is missed, "kitchen_appliance" and "kitchen_cabinet" are switched to "ottomon" and "stand". Is this any mismatch of renderer or wrong .csv file?

[2]. When training with labels other than ["arch", "door"], how can I randomize training episodes over a concrete set of objectTypes? How do I specify either to choose only scenes that contain the predefined set of objectTypes, or to choose an objectType from the predefined set for a given scene, so that there is no error because the chosen objectType is missing from the scene and a priori cannot be reached by the agent?

kvas7andy commented 6 years ago

[1]. Sorry for the long text; I arrived at a solution. I had forgotten to specify a concrete --objecttypes_file parameter!

[2]. This question is still unclear to me.

msavva commented 6 years ago

Hi @kvas7andy ,

To confirm, problem [1] was due to the missing parameter specification for --objecttypes_file, correct?

For problem [2], you will currently need to filter the scene train/val/test sets by checking for the object label you want to use as targets. To do this you can refer to this CSV file which contains a column named modelCats giving all object category labels present in each SUNCG scene. You can then take the intersection of the scene ids in this file with appropriate object labels and the ids in the SUNCG train/val/test split file to create an adjusted split file for your task. As this is somewhat tedious right now, logic to handle this automatically would be a good future release improvement -- we will work on it.
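
As a rough illustration of the filtering described above (not the framework's own implementation), a sketch along these lines could work. The file names (suncg_scene_stats.csv, train.csv) and the exact column names ('id', 'modelCats') are placeholders; adapt them to the metadata CSV and split files you actually have.

import csv

TARGET_CATEGORIES = {'window', 'toilet'}

# 1) Collect SUNCG scene ids whose modelCats column lists at least one target category.
scene_ids_with_targets = set()
with open('suncg_scene_stats.csv') as f:  # placeholder name for the metadata CSV
    for row in csv.DictReader(f):
        cats = set(row['modelCats'].split(','))
        if cats & TARGET_CATEGORIES:
            scene_ids_with_targets.add(row['id'])

# 2) Intersect with the original train/val/test split to write an adjusted split file.
with open('train.csv') as f_in, open('train_filtered.csv', 'w', newline='') as f_out:
    reader = csv.DictReader(f_in)
    writer = csv.DictWriter(f_out, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row['id'] in scene_ids_with_targets:
            writer.writerow(row)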

kvas7andy commented 6 years ago

Hi @msavva,

Yes, the first problem occurred only because the --objecttypes_file parameter was missing from the command line.
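
For example (the script and paths here are placeholders; the flag itself is the one discussed above):

python3 -m minos.tools.pygame_client --objecttypes_file /path/to/objecttypes_file ...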

Quite a good workaround for now, thank you. Of course, it would be much better to implement such filtering inside the framework, but I think it is the least important issue.

kvas7andy commented 6 years ago

I have to reopen the issue and ask: how can I get information about the concrete object chosen by the simulator for the current episode? The information I need is its type.

For example, apart from the "arch" and "door" categories, can I specify another one and be sure that, with 'select': 'random', the simulator will uniformly choose one of the categories and find it in the scene? And do I need to filter for scenes that contain all of the specified modelCats, or at least one of them?

The simulator chooses an objectId, which can be viewed via self.start_config_this_episode, but I cannot tell what the object type of that objectId is.

An example of the output I am talking about:

sim00:EPINFO:6543,
{
  'shortestPath': {'isValid': True, 'rooms': [3], 'doors': ['0_33'], 'distance': 3.548528137423857},
  'sceneId': 'p5dScene.a089f7c02c74d61284bc8d43dd5e23fd',
  'start': {
    'position': [-42.731042127232556, 0.5950000007450583, -38.62511311704531],
    'room': '0_2',
    'angle': 3.159379655709538,
    'cell': {'isValid': True, 'i': 64, 'id': 6082, 'j': 51}
  },
  'goal': {
    'position': [-46.06749974563718, 1.1899999368091043, -38.160003507509536],
    'cell': {'isValid': True, 'i': 31, 'id': 6521, 'j': 55},
    'bbox': {'min': [-46.124999795109034, 0.03999995944889734, -38.81500345397298],
             'max': [-46.00999969616532, 2.3399999141693115, -37.505003561046095]},
    'objectId': '0_33',
    'initialOffsetFromAgent': [0, 0, 0]
  },
  'task': 'object_goal'
}

kvas7andy commented 6 years ago

When I specify the usual ["arch", "door"], an example episode_info output from the simulator is:

{
  'sceneId': 'p5dScene.69e283eec2f85eb53d528dcfb2172ab9',
  'goal': {
    'objectId': '0_5',
    'cell': {
      'isValid': True,
      'i': 18,
      'j': 4,
      'id': 378
    },
    'initialOffsetFromAgent': [
      -43.4512477517128,
      0.01999997699248368,
      -44.27999892830849
    ],
    'position': [
      -44.10126876831055,
      1.0799999217390397,
      -44.36999898031354
    ],
    'bbox': {
      'max': [
        -43.4512477517128,
        2.1399998664855957,
        -44.27999892830849
      ],
      'min': [
        -44.751289784908295,
        0.01999997699248368,
        -44.45999903231859
      ]
    }
  },
  'shortestPath': {
    'isValid': True,
    'rooms': [
      1,
      2
    ],
    'doors': [
      '0_6',
      '0_5'
    ],
    'distance': 3.6384776310850238
  },
  'task': 'object_goal',
  'start': {
    'room': '0_0',
    'angle': 5.289163787079079,
    'position': [
      -44.74203020745983,
      0.5950000007450584,
      -41.43543911813546
    ],
    'cell': {
      'isValid': True,
      'i': 12,
      'j': 34,
      'id': 3072
    }
  }
}

but when I specify other labels, ["window", "toilet"]:

{
  'goal': {
    'bbox': {
      'min': [
        -45.85625401139259,
        0.05070168693782762,
        -42.45634177327156
      ],
      'max': [
        -45.14885500073433,
        0.9144850373268127,
        -42.10365578532219
      ]
    },
    'initialOffsetFromAgent': [
      -6.9892619454209735,
      0,
      -0.06785018490302952
    ],
    'objectId': '0_8',
    'position': [
      -45.50255450606346,
      0.4825933621323202,
      -42.279998779296875
    ],
    'cell': {
      'isValid': True,
      'i': 4,
      'id': 2254,
      'j': 25
    }
  },
  'shortestPath': {
    'isValid': False
  },
  'task': 'object_goal',
  'sceneId': 'p5dScene.69e283eec2f85eb53d528dcfb2172ab9',
  'start': {
    'room': '0_3',
    'position': [
      -38.74433110007286,
      0.5950000007450585,
      -39.30428803614075
    ],
    'cell': {
      'isValid': True,
      'i': 72,
      'id': 5022,
      'j': 55
    },
    'angle': 5.880824631535051
  }
}

Lots of parameters are missing, but the crucial one is objectType (which cannot be inferred from the objectId).

I got this after looking inside measure_fun, intending to rewrite it analogously to the MeasureGoalRoomType class (and get similar results), but for a goal objectType. Still no solution.

kvas7andy commented 6 years ago

Hi, @msavva!

I am sorry to rush you, but this question is vital for me. Could you please make any suggestions ASAP?

Thank you)

msavva commented 6 years ago

Hi @kvas7andy, you are correct that the objectType of the goal object is not returned as part of the observation. It is possible to retrieve the objectType by resolving the objectId within a specific SUNCG scene to a model id. Here's an example Python snippet showing how to get the objectType given the objectId:

import csv
import json
import os

SUNCG_PATH = os.path.expanduser('~/work/suncg/')
MODEL_METADATA_FILE = os.path.expanduser('~/code/minos/minos/server/node_modules/sstk/metadata/data/suncg/suncg.planner5d.models.full.csv')

# Map SUNCG model id -> category label using the sstk metadata CSV
model_id2cat = {}
for r in csv.DictReader(open(MODEL_METADATA_FILE)):
    model_id2cat[r['id']] = r['category']

def object_id_to_type(scene_id, object_id):
    # Load the scene's house.json and find the node whose id matches the goal objectId
    house = json.load(open(os.path.join(SUNCG_PATH, 'house', scene_id, 'house.json')))
    object_nodes = house['levels'][0]['nodes']
    model_id = [x for x in object_nodes if x['id'] == object_id][0]['modelId']
    # Resolve the node's modelId to its category label
    object_type = model_id2cat[model_id]
    return object_type

This is a clunky, temporary workaround. We will incorporate passing back of objectType in addition to objectId in a near future update.
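
For instance, with the episode_info shown earlier in this thread, usage would look roughly like this (stripping the 'p5dScene.' prefix to obtain the SUNCG house id is an assumption based on the house/<id>/house.json layout above):

episode_info = {'sceneId': 'p5dScene.a089f7c02c74d61284bc8d43dd5e23fd',
                'goal': {'objectId': '0_33'}}
scene_id = episode_info['sceneId'].split('.', 1)[1]  # 'a089f7c02c74d61284bc8d43dd5e23fd'
print(object_id_to_type(scene_id, episode_info['goal']['objectId']))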

kvas7andy commented 6 years ago

@msavva, thank you very much! I am checking this solution right now. Can you confirm that the object categories I list in category are sampled uniformly with 'select': 'random'? And should the scenes I filter contain models of all the categories stated in env_config, or at least one of them?

msavva commented 6 years ago

@kvas7andy The goal selection logic picks all objects that have any of the specified goal categories in category, and then uniformly samples from this set of candidate goal objects when select is set to random. Thus, the filtered scenes should contain at least one object belonging to any one of the goal categories.
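
Expressed as a config snippet, that behaviour corresponds to something like the following (the key names 'category' and 'select' are taken from this thread; check your env_config files for the exact spelling in your MINOS version):

env_config = {
    'task': 'object_goal',
    'goal': {
        'category': ['window', 'toilet'],  # all objects matching any of these categories become candidates
        'select': 'random',                # the goal is sampled uniformly from the candidate objects
    },
}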

kvas7andy commented 6 years ago

Thank you for your help, @msavva!

chenwydj commented 5 years ago

Hi @msavva ! I wonder if the Matterport3D semantic segmentation data is ready to use? I'm still trying to use either "objectTypes" or "regions" to load the Matterport3D semantic segmentation data. If it's not ready, I will stop trying. Thank you!