How to parse the pixel class/instance label of images at semantic and semantic_pretty directory

alexsax commented 7 years ago

Semantic images come in two variants, semantic and semantic_pretty. Both include information from the point cloud annotations, but only the semantic version should be used for learning! The labels can be found in assets/semantic_labels.json, and images can be parsed using the convenience functions in utils.py.

Specifically, the semantic images are encoded as 3-channel 8-bit PNGs. Each pixel's three channel values are read as the digits of a 24-bit, base-256 integer, and that integer is an index into the labels array in semantic_labels.json.
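As a rough illustration of that encoding, the sketch below decodes one RGB pixel into a label index. It assumes the red channel is the most significant base-256 digit. The name rgb_to_index is a hypothetical stand-in used only for illustration; it is not the get_index function from utils.py.

```python
def rgb_to_index(pixel):
    """Interpret an (R, G, B) pixel as a 3-digit base-256 integer.

    Assumption: R is the most significant digit and B the least.
    """
    # Cast to plain Python ints so 8-bit NumPy values cannot overflow.
    r, g, b = (int(c) for c in pixel[:3])
    return r * 256 * 256 + g * 256 + b


# Hypothetical pixel values, chosen only to show the arithmetic.
print(rgb_to_index((0, 0, 7)))  # 7
print(rgb_to_index((0, 1, 2)))  # 1 * 256 + 2 = 258
```

The explicit int() cast matters when the pixel comes from a uint8 image array, because multiplying an 8-bit value by 256 can overflow or raise an error depending on the NumPy version.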

To make this concrete, take the following semantic panorama:

[Image: camera_04a287849657478ea774727e5bff5202_office_3_frame_equirectangular_domain_semantic]

Let's say that you've loaded the image into memory as a numpy array called img, and you want the label for the pixel at (1500, 2000), which is the leftmost sofa chair in this image. utils.py provides get_index, load_labels and parse_label for extracting the label information. Here is what your code might look like:

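# Note: scipy.misc.imread is deprecated in newer SciPy releases
from scipy.misc import imread
from assets.utils import *  # Assets should be downloaded from this repo

# Load the list of labels that the decoded pixel values index into
labels = load_labels('/path/to/assets/semantic_labels.json')

img = imread('/path/to/image.png')
pix = img[1500, 2000]  # RGB value of the pixel of interest
instance_label = labels[get_index(pix)]  # decode the pixel to an index, then look it up
instance_label_as_dict = parse_label(instance_label)
print(instance_label_as_dict)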

This prints:

{'instance_num': 5, 'instance_class': u'sofa', 'room_num': 3, 'room_type': u'office', 'area_num': 3}

Here we can see that this pixel belongs to the 5th instance of class 'sofa', located in office 3 of area 3.

Finally, note that pixels where the data is missing are encoded with the color #0D0D0D. Read as an index, this is 855309, which is larger than len(labels), so it does not correspond to any label.
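One way to guard against those missing pixels is to check the decoded index against the length of the label list before indexing. This is only a sketch: rgb_to_index and lookup_label are hypothetical helpers (not part of utils.py), the label strings are placeholders rather than real entries from semantic_labels.json, and red is assumed to be the most significant digit.

```python
MISSING_COLOR = (0x0D, 0x0D, 0x0D)  # the #0D0D0D color used for missing data


def rgb_to_index(pixel):
    """Decode an (R, G, B) pixel as a base-256 integer (R assumed most significant)."""
    r, g, b = (int(c) for c in pixel[:3])
    return r * 256 * 256 + g * 256 + b


def lookup_label(pixel, labels):
    """Return the label for a pixel, or None when its index is out of range."""
    index = rgb_to_index(pixel)
    if index >= len(labels):
        return None  # e.g. the missing-data color
    return labels[index]


# Placeholder labels for illustration only; real ones come from semantic_labels.json.
labels = ["placeholder_0", "placeholder_1", "placeholder_2"]

print(lookup_label((0, 0, 1), labels))      # placeholder_1
print(lookup_label(MISSING_COLOR, labels))  # None
```

Returning None keeps the check in one place, so callers do not need an IndexError handler around every lookup.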

xiahouzuoxin commented 7 years ago

I've found assets/semantic_labels.json now, thanks very much.

alexsax commented 7 years ago

Yes, thank you for pointing this out! We forgot to include that file in our initial release :)

ustundag commented 5 years ago

Hi all, I want to leave a comment for future developers visiting this issue.

First, thank you all for this great research. I noticed that the imread function from scipy.misc is deprecated and SciPy no longer supports it. They point to imageio.imread instead.

I started out reading the images with OpenCV, and only later noticed a difference in the returned pixel values: OpenCV returns the channel values in reversed order (BGR), while imageio returns them in the same order as SciPy previously did (RGB).

So it is better to use imageio rather than other image libraries such as OpenCV or Matplotlib. Maybe you can also update utils.py.
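To show why the channel order matters, here is a small sketch that decodes the same pixel in both orders. The helper names and pixel values are hypothetical, and the decoding assumes red is the most significant base-256 digit.

```python
def rgb_to_index(pixel):
    """Decode an (R, G, B) pixel as a base-256 integer (R assumed most significant)."""
    r, g, b = (int(c) for c in pixel[:3])
    return r * 256 * 256 + g * 256 + b


def bgr_to_rgb(pixel):
    """Reverse the channel order of a BGR pixel, such as one read with OpenCV."""
    b, g, r = pixel[:3]
    return (r, g, b)


# Hypothetical pixel: (R, G, B) = (0, 1, 2) as it would appear in BGR order.
bgr_pixel = (2, 1, 0)

wrong_index = rgb_to_index(bgr_pixel)              # decoded without reordering
right_index = rgb_to_index(bgr_to_rgb(bgr_pixel))  # reordered to RGB first

print(wrong_index)  # 131328
print(right_index)  # 258
```

If you do read the images with OpenCV, reordering the channels before decoding should give the same indices as an RGB reader.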
