bonlime / keras-deeplab-v3-plus

Keras implementation of Deeplab v3+ with pretrained weights
MIT License

Problem with Training/Prediction (Different shape and number of classes) #72

Open thepate94227 opened 5 years ago

thepate94227 commented 5 years ago

First, my task: I have nearly 3000 images of shape (300,200) showing two different ropes. Each image contains rope 1, rope 2 and the background. My labels/masks are images in which, for example, the pixel value 0 represents the background, 1 the first rope and 2 the second rope. For DeepLab I converted my ground truth into an image with one channel per class. In my case I have 3 classes: red rope, blue rope and background. So I created one label of shape (300,200) for the background, one for the red rope and one for the blue rope. Then I stacked the 3 labels into one image, so the final label for my NN has the shape (300,200,3).

You can see the input picture and the 3 labels below. Note that my ground truth is as described above: all three labels stacked into one image, with values of either 0 or 1.
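A minimal sketch of the stacking step described above, assuming the mask is available as an integer NumPy array with values 0, 1 and 2. The helper name `to_one_hot` and the toy mask are made up for illustration and are not part of the repository.

```python
import numpy as np

def to_one_hot(mask, num_classes):
    """Convert an integer mask of shape (H, W) into a one-hot array (H, W, num_classes)."""
    mask = np.asarray(mask)
    if mask.min() < 0 or mask.max() >= num_classes:
        raise ValueError("mask contains a class id outside [0, num_classes)")
    # Row c of the identity matrix is the one-hot vector for class c
    return np.eye(num_classes, dtype=np.float32)[mask]

# Toy 2x3 mask: 0 = background, 1 = first rope, 2 = second rope
toy_mask = np.array([[0, 1, 2],
                     [0, 0, 1]])
one_hot = to_one_hot(toy_mask, num_classes=3)
print(one_hot.shape)
```

Taking `argmax` over the last axis of the result recovers the original integer mask, which is a quick way to check that a stacked label is valid.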

[Images: seile_000001_resized_cut (input picture), label000, label111, label222 (the three labels)]

Then I created a DeepLab model and simply trained it as in the Keras tutorial:

import numpy as np
from model import Deeplabv3
import time
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard

batch_size = 16
num_classes = 3
num_img = 2931

NAME = "DeepLab-{}".format(int(time.time()))
deeplab_model = Deeplabv3(input_shape=(300,200,3), classes=num_classes)
tensorboard = TensorBoard(log_dir="Tensorboard/logs/{}".format(NAME))
deeplab_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

# we create two instances with the same arguments
data_gen_args = dict(featurewise_center=True,
                     featurewise_std_normalization=True,
                     rotation_range=90,
                     width_shift_range=0.1,
                     height_shift_range=0.1,
                     zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

# Provide the same seed and keyword arguments to the fit and flow methods
seed = 1
#image_datagen.fit(images, augment=True, seed=seed)
#mask_datagen.fit(masks, augment=True, seed=seed)

image_generator = image_datagen.flow_from_directory(
    'KerasCol/Input/',
    target_size=(300,200),
    class_mode=None,
    seed=seed,
    batch_size=batch_size,
    color_mode='rgb')

mask_generator = mask_datagen.flow_from_directory(
    'KerasCol/Label/',
    target_size=(300,200),
    class_mode=None,
    seed=seed,
    batch_size=batch_size)

# combine generators into one which yields image and masks
train_generator = zip(image_generator, mask_generator)

print("compiled")

deeplab_model.fit_generator(train_generator, steps_per_epoch= np.uint32(num_img / batch_size), epochs=10, callbacks=[tensorboard])

print("finish fit")
deeplab_model.save_weights('deeplab_col2.h5')
deeplab_model.save('deeplab-col2')

So I simply loaded the DeepLab model, created two generators for my inputs and my labels using Keras' flow_from_directory function, and then called model.fit_generator.

My result is a model with over 80% accuracy: [Image: kerascol2]

After saving my model, I loaded it in another Python file to use it for prediction as bonlime showed, but because my images have a different shape (300,200), I customized the code a bit:


from PIL import Image, ImageEnhance
import keras
from matplotlib import pyplot as plt
import cv2 # used for resize. if you dont have it, use anything else
import numpy as np
from model import relu6, BilinearUpsampling

deeplab_model = keras.models.load_model('deeplab-col2',custom_objects={'relu6':relu6,'BilinearUpsampling':BilinearUpsampling })

img = np.array(Image.open("KerasCol/Input/0/0001.png"))
w, h, _ = img.shape
img = img / 127.5 - 1.
#pad_x = int(512 - resized.shape[0])
resized2 = np.pad(img,((0,0),(0,0),(0,0)),mode='constant')
resized2 = np.expand_dims(resized2,0)
res = deeplab_model.predict(resized2)
labels = np.argmax(res.squeeze(),-1)
plt.imshow(labels)
plt.show()
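One thing worth checking here is whether training and prediction scale the input the same way. The prediction script maps pixels to [-1, 1], while the training generators set no `rescale`, and the `fit` calls needed for featurewise normalization are commented out (as far as I know Keras then only warns and skips that step). The sketch below is an assumption about a possible fix, not a confirmed one: a single hypothetical `preprocess` function used in both places. It also drops an alpha channel, since a PNG may load as RGBA.

```python
import numpy as np

def preprocess(img):
    """Scale 8-bit pixel values from [0, 255] to [-1, 1] and keep only 3 channels."""
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 3 and img.shape[-1] == 4:
        img = img[..., :3]  # drop the alpha channel of an RGBA image
    return img / 127.5 - 1.0

# Random stand-in for a loaded 300x200 RGBA image (not real data)
dummy = np.random.randint(0, 256, size=(300, 200, 4), dtype=np.uint8)
batch = np.expand_dims(preprocess(dummy), 0)  # add the batch dimension
print(batch.shape)
```

During training, the same function could be passed as `preprocessing_function=preprocess` to the image `ImageDataGenerator` only; the mask generator should not rescale or normalize its values.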

My problem is that the label I get as a prediction result is an image full of zeros. I tried many images, but the result is always the same. That means that either my prediction code or my training code is wrong. Keras showed an accuracy of over 80%. Can this be wrong? Or is my customized prediction code wrong?
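A high pixel accuracy and an all-zero prediction are not necessarily contradictory when the background dominates the image. The sketch below illustrates this with an invented mask in which the two ropes cover 20% of the pixels (the stripe positions are assumptions chosen for the example, not measurements from the dataset). It compares plain pixel accuracy with a per-class intersection-over-union.

```python
import numpy as np

def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(y_true == y_pred))

def per_class_iou(y_true, y_pred, num_classes):
    """Intersection over union for each class id; NaN if the class never occurs."""
    ious = []
    for c in range(num_classes):
        t = y_true == c
        p = y_pred == c
        union = np.logical_or(t, p).sum()
        inter = np.logical_and(t, p).sum()
        ious.append(float(inter / union) if union else float("nan"))
    return ious

# Hypothetical 300x200 ground truth: two ropes as vertical stripes
y_true = np.zeros((300, 200), dtype=np.int64)
y_true[:, 40:60] = 1    # first rope
y_true[:, 140:160] = 2  # second rope
y_pred = np.zeros_like(y_true)  # a model that predicts background everywhere

acc = pixel_accuracy(y_true, y_pred)
ious = per_class_iou(y_true, y_pred, num_classes=3)
print(acc, ious)
```

In this toy case the all-background prediction reaches 80% accuracy while the IoU of both rope classes is zero, so a per-class metric is the more informative one to monitor.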

One problem is that if you have, for example, 21 classes, DeepLab expects a label of shape (width,height,21). But Keras' flow_from_directory can only read grayscale (width,height,1), RGB (width,height,3) and RGBA (width,height,4) images; see my post here: https://github.com/bonlime/keras-deeplab-v3-plus/issues/70 Right now I have 3 classes and I pretend that the labels are RGB images, so I get no errors, but maybe this is why I get a high accuracy even though the model isn't working...
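One way around the channel limit is to skip flow_from_directory for the masks and write a plain Python generator that one-hot encodes integer masks on the fly. The following is a rough sketch under the assumption that the images and integer masks fit in memory as NumPy arrays; the function name `segmentation_batches` and the random stand-in data are hypothetical, and augmentation is not covered.

```python
import numpy as np

def segmentation_batches(images, masks, num_classes, batch_size, seed=0):
    """Endlessly yield (image_batch, one_hot_mask_batch).

    images: float array of shape (N, H, W, 3)
    masks:  integer array of shape (N, H, W) holding class ids
    """
    n = len(images)
    if batch_size > n:
        raise ValueError("batch_size must not exceed the number of samples")
    rng = np.random.default_rng(seed)
    eye = np.eye(num_classes, dtype=np.float32)
    while True:
        order = rng.permutation(n)  # reshuffle once per pass over the data
        # Only full batches are yielded; the last partial batch is dropped
        for start in range(0, n - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield images[idx], eye[masks[idx]]

# Random stand-in data with 21 classes (not real images)
images = np.random.rand(10, 30, 20, 3).astype(np.float32)
masks = np.random.randint(0, 21, size=(10, 30, 20))
x, y = next(segmentation_batches(images, masks, num_classes=21, batch_size=4))
print(x.shape, y.shape)
```

Such a generator could be passed to fit_generator with `steps_per_epoch = len(images) // batch_size`. Because the image and its mask are indexed together, they cannot fall out of sync, and every batch has the same size.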

FreedomGu commented 5 years ago

I have the same problem.

blaxe05 commented 5 years ago

@thepate94227 Hi, do you know how to feed data with flow_from_directory when the mask has size (384x384x5)? The model trained for several epochs and then threw an "incompatible shape error". I would appreciate it if anyone has any ideas!

thepate94227 commented 5 years ago

@blaxe05 Hi, I am sorry, but because of this and other problems I didn't use DeepLab. I used Mask R-CNN instead. So unfortunately I don't know the solution to this problem...

TimbusCalin commented 5 years ago

I also ran into this issue. Maybe we are making a mistake at some point during training or inference.

TimbusCalin commented 5 years ago

> @blaxe05 Hi, I am sorry, but because of this problem and other problems i didn't use Deeplab. In my case i used Mask R-CNN. So unfortunately i don't know the solution for this problem...

So you didn't manage to use DeepLabV3 Plus eventually...

MatthiasSchinzel commented 5 years ago

I had a similar problem. Reducing the batch size to one solved the problem for me, but I don't understand why.

I was training on a 1080 Ti with MobileNetV2 as the backbone.