This repo is not longer maintained. I won't respond to issues but will merge PR
DeepLab is a state-of-art deep learning model for semantic image segmentation.
Model is based on the original TF frozen graph. It is possible to load pretrained weights into this model. Weights are directly imported from original TF checkpoint.
Segmentation results of original TF model. Output Stride = 8
Segmentation results of this repo model with loaded weights and OS = 8
Results are identical to the TF model
Segmentation results of this repo model with loaded weights and OS = 16
Results are still good
Model will return tensor of shape (batch_size, height, width, num_classes)
. To obtain labels, you need to apply argmax to logits at exit layer. Example of predicting on image1.jpg
:
import numpy as np
from PIL import Image
from matplotlib import pyplot as plt
from model import Deeplabv3
# Generates labels using most basic setup. Supports various image sizes. Returns image labels in same format
# as original image. Normalization matches MobileNetV2
trained_image_width=512
mean_subtraction_value=127.5
image = np.array(Image.open('imgs/image1.jpg'))
# resize to max dimension of images from training dataset
w, h, _ = image.shape
ratio = float(trained_image_width) / np.max([w, h])
resized_image = np.array(Image.fromarray(image.astype('uint8')).resize((int(ratio * h), int(ratio * w))))
# apply normalization for trained dataset images
resized_image = (resized_image / mean_subtraction_value) - 1.
# pad array to square image to match training images
pad_x = int(trained_image_width - resized_image.shape[0])
pad_y = int(trained_image_width - resized_image.shape[1])
resized_image = np.pad(resized_image, ((0, pad_x), (0, pad_y), (0, 0)), mode='constant')
# make prediction
deeplab_model = Deeplabv3()
res = deeplab_model.predict(np.expand_dims(resized_image, 0))
labels = np.argmax(res.squeeze(), -1)
# remove padding and resize back to original image
if pad_x > 0:
labels = labels[:-pad_x]
if pad_y > 0:
labels = labels[:, :-pad_y]
labels = np.array(Image.fromarray(labels.astype('uint8')).resize((h, w)))
plt.imshow(labels)
plt.waitforbuttonpress()
from model import Deeplabv3
deeplab_model = Deeplabv3(input_shape=(384, 384, 3), classes=4)
#or you can use None as shape
deeplab_model = Deeplabv3(input_shape=(None, None, 3), classes=4)
After that you will get a usual Keras model which you can train using .fit
and .fit_generator
methods.
Useful parameters can be found in the original repository.
Important notes:
OS=8
, Xception backbone should be trained with OS=16
and only inferenced with OS=8
.This model can be retrained check this notebook. Finetuning is tricky and difficult because of the confusion between training
and trainable
in Keras. See this issue for a discussion and possible alternatives.
In order to load model after using model.save() use this code:
from model import relu6
deeplab_model = load_model('example.h5')
There are 2 available backbones. Xception backbone is more accurate, but has 25 times more parameters than MobileNetv2.
For MobileNetv2 there are pretrained weights only for alpha=1
. However, you can initiate model with different values of alpha.
The latest vesrion of this repo uses TF Keras, so you only need TF 2.0+ installed
tensorflow-gpu==2.0.0a0
CUDA==9.0
If you want to use older version, use following commands:
git clone https://github.com/bonlime/keras-deeplab-v3-plus/
cd keras-deeplab-v3-plus/
git checkout 714a6b7d1a069a07547c5c08282f1a706db92e20
tensorflow-gpu==1.13
Keras==2.2.4