Closed mluerig closed 6 years ago
Please see the example for how to convert the annotations to a numpy array: https://github.com/wkentaro/labelme/blob/master/labelme/cli/json_to_dataset.py
Also, there is the option --nodata, which skips saving imageData. You can then use imagePath, which is relative to the annotation file (the JSON file).
https://github.com/wkentaro/labelme#usage
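To illustrate the point about --nodata, a minimal sketch of loading an annotation that has no imageData and resolving imagePath relative to the JSON file (the demo JSON and image written here are stand-ins for real labelme output):

```python
import json
import os.path as osp
import tempfile

import numpy as np
import PIL.Image

# Set up a tiny demo annotation; in practice these files come from labelme.
tmp = tempfile.mkdtemp()
PIL.Image.fromarray(np.zeros((4, 4, 3), dtype=np.uint8)).save(
    osp.join(tmp, 'img.png'))
with open(osp.join(tmp, 'example.json'), 'w') as f:
    json.dump({'imagePath': 'img.png', 'shapes': []}, f)

# With --nodata there is no imageData field, so load the image via
# imagePath, which is relative to the JSON file itself.
json_file = osp.join(tmp, 'example.json')
with open(json_file) as f:
    data = json.load(f)
img_file = osp.join(osp.dirname(json_file), data['imagePath'])
img = np.asarray(PIL.Image.open(img_file))
print(img.shape)  # (4, 4, 3)
```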
ah excellent, thanks. probably couldn't hurt to point to this in the readme
Ah, sorry. It is not described in the README.md on the top page. You need to refer to: https://github.com/wkentaro/labelme/tree/master/examples/single_image#convert-to-dataset
Yep, I saw that, and with the user warning you already implemented it's fine for single files, but not for an actual workflow to create a full dataset. It might be nice to point to that utility script somewhere (e.g. in the tutorial), so it is clear how people can customize their own workflow. I feel like the need to adjust the format of the labels and training images for segmentation models is a common issue out there.
Recently I added a script that converts (multiple) JSON files to a VOC-like dataset (labelme2voc.py) for semantic segmentation and instance segmentation, with examples. Is that what you are pointing out?
Actually, there is no script that converts labelme's JSON files to COCO format, because I usually write my own script that converts the JSON files to numpy .npz files, which can be easily loaded and used for training. https://github.com/wkentaro/labelme/issues/34 is the issue for that.
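A hedged sketch of that npz approach (the exact layout wkentaro uses isn't shown here, so this assumes one .npz per annotation holding the image array and the label mask from shapes_to_label):

```python
import os.path as osp
import tempfile

import numpy as np

# Stand-ins for the loaded image and the rasterized label mask.
img = np.zeros((8, 8, 3), dtype=np.uint8)
lbl = np.zeros((8, 8), dtype=np.int32)
lbl[2:5, 2:5] = 1  # a 3x3 patch of class 1

# Save both arrays in one compressed .npz file ...
npz_file = osp.join(tempfile.mkdtemp(), 'sample.npz')
np.savez_compressed(npz_file, img=img, lbl=lbl)

# ... and load them back for training with a single call.
loaded = np.load(npz_file)
print(int(loaded['lbl'].sum()))  # 9
```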
ah yes, nice!
Does anyone have a script for converting multiple XML polygon or binary PNG segmentations to a single JSON file?
This code takes a .json file created using labelme as input, outputs object instance masks as binary .png images, and saves both the images and their masks in a neat folder structure. It is adapted from labelme2voc.py by wkentaro and is more suitable for people using Mask R-CNN. If you have any queries, ask me.
import glob
import json
import os
import os.path as osp
import sys

import numpy as np
import PIL.Image
import cv2
import labelme

DATA_DIR = '/home/war/Downloads/initial_experiment/train'
OUT_DIR = DATA_DIR + '/data_and_masks'

# Build the class-name -> id mapping from labels.txt.
# The first two lines must be '__ignore__' (-1) and '_background_' (0).
class_names = []
class_name_to_id = {}
for i, line in enumerate(open(DATA_DIR + '/labels.txt').readlines()):
    class_id = i - 1  # starts with -1
    class_name = line.strip()
    class_name_to_id[class_name] = class_id
    if class_id == -1:
        assert class_name == '__ignore__'
        continue
    elif class_id == 0:
        assert class_name == '_background_'
    class_names.append(class_name)
class_names = tuple(class_names)
print('class_names:', class_names)

out_class_names_file = osp.join(DATA_DIR, 'class_names.txt')
with open(out_class_names_file, 'w') as f:
    f.writelines('\n'.join(class_names))
print('Saved class_names:', out_class_names_file)

if osp.exists(OUT_DIR):
    print('Output directory already exists:', OUT_DIR)
    sys.exit(1)
os.makedirs(OUT_DIR)

for label_file in sorted(glob.glob(osp.join(DATA_DIR, '*.json'))):
    base = osp.splitext(osp.basename(label_file))[0]
    with open(label_file) as f:
        data = json.load(f)

    # imagePath is relative to the JSON file.
    img_file = osp.join(osp.dirname(label_file), data['imagePath'])
    img = np.asarray(PIL.Image.open(img_file))

    # Rasterize the polygons into a label mask (pixel value = class id).
    lbl = labelme.utils.shapes_to_label(
        img_shape=img.shape,
        shapes=data['shapes'],
        label_name_to_value=class_name_to_id,
    )

    # Split the label mask into one binary (0/255) mask per class.
    # Hard-coded here for exactly two classes (ids 1 and 2).
    instance1 = np.copy(lbl)
    instance2 = np.copy(lbl)
    instance1[lbl == 2] = 0
    instance2[lbl == 1] = 0
    instance1 = (instance1 * 255).astype(np.uint8)
    instance2 = (instance2 // 2 * 255).astype(np.uint8)

    # Folder layout: OUT_DIR/<base>/images/<base>.png
    #                OUT_DIR/<base>/masks/<base>_instanceN.png
    os.makedirs(osp.join(OUT_DIR, base, 'images'))
    PIL.Image.fromarray(img).save(
        osp.join(OUT_DIR, base, 'images', base + '.png'))
    os.makedirs(osp.join(OUT_DIR, base, 'masks'))
    cv2.imwrite(osp.join(OUT_DIR, base, 'masks', base + '_instance1.png'),
                instance1)
    cv2.imwrite(osp.join(OUT_DIR, base, 'masks', base + '_instance2.png'),
                instance2)
Would this be useful to generalize for more labels?
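It should generalize. One sketch (not tested against the full script above) would replace the hard-coded instance1/instance2 block with a loop over every non-background id that actually appears in the label mask, writing one binary mask per id:

```python
import numpy as np

# A tiny stand-in label mask; in the script this would be the `lbl`
# array returned by labelme.utils.shapes_to_label.
lbl = np.array([[0, 1, 1],
                [0, 2, 3],
                [0, 0, 3]])

masks = {}
for class_id in np.unique(lbl):
    if class_id == 0:  # skip _background_
        continue
    # 255 where this id occurs, 0 elsewhere: a binary uint8 mask.
    masks[int(class_id)] = np.where(lbl == class_id, 255, 0).astype(np.uint8)

print(sorted(masks))  # [1, 2, 3]
```

Each entry of `masks` could then be written out with cv2.imwrite exactly as the two hard-coded masks are, with the class id in the filename.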
Does this code also work for .json files created by the VGG Image Annotator?
Thank you everyone!
Can this be used to extract semantic segmentation masks from a labelme .json file?
labelme saves everything (the image and the labels/polygons) in JSON files. I am trying to decode the image string stored alongside the polygon points in the JSON files in Python, so I can convert both to numpy arrays and use the points to draw a binary mask (some models, like Mask R-CNN, don't use JSON but require numpy arrays for training).
What type of encoding is used to store the image in the JSON files (under "imageData")?
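imageData is the raw image file bytes (PNG/JPEG) encoded as base64. A minimal round-trip sketch, building such a string the same way and decoding it back to a numpy array (the demo image here is a stand-in for a real labelme annotation):

```python
import base64
import io

import numpy as np
import PIL.Image

# Build a base64 string the way labelme stores imageData:
# base64-encode the raw encoded image bytes.
buf = io.BytesIO()
PIL.Image.fromarray(np.zeros((2, 3, 3), dtype=np.uint8)).save(buf, format='PNG')
image_data = base64.b64encode(buf.getvalue()).decode('utf-8')

# Decoding goes the other way: base64 -> raw bytes -> PIL -> numpy array.
img = np.asarray(PIL.Image.open(io.BytesIO(base64.b64decode(image_data))))
print(img.shape)  # (2, 3, 3)
```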