bonlime / keras-deeplab-v3-plus

Keras implementation of Deeplab v3+ with pretrained weights
MIT License

How to Load the Data with Keras, Problem with Number of Channels #70

Open thepate94227 opened 5 years ago

thepate94227 commented 5 years ago

I created a DeepLab model and want to train it with my own data. DeepLab wants the ground-truth labels as a stack. For example: my image shape is (300,200), my images are RGB, I have 10 classes and 2000 images. For DeepLab my ground truth has to be the labels in a stack, so it must have the shape (300,200,10). If I want to use Keras for this task, I use the function flow_from_directory, and there I can only choose between 1 (grayscale), 3 (rgb) or 4 (rgba) channels. How can I load my data into the model correctly?
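
What I mean by a stack is roughly the following (just a sketch of how I picture it, assuming each pixel of the label image holds the class index and using to_categorical for the one-hot encoding):

from keras.utils import to_categorical
import numpy as np

# label image: one class index per pixel, values 0..9 for 10 classes
label = np.zeros((300, 200), dtype=np.uint8)
stack = to_categorical(label, num_classes=10)  # shape (300, 200, 10), one channel per class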

My real data has 3 classes, so I can pretend my ground truth is RGB and use Keras flow_from_directory. But in the future I will take pictures with more than 3 classes, and right now my NN is not training well. It doesn't learn properly, although the loss is getting lower. Sometimes I get an accuracy of 80%, but when I use the prediction like bonlime suggests, I only get zeros...
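
By "the prediction like bonlime suggests" I mean roughly this (image here is just a placeholder for one of my input images):

res = deeplab_model.predict(np.expand_dims(image, 0))  # image: (300, 200, 3)
labels = np.argmax(res.squeeze(), -1)                  # (300, 200) class index per pixel
print(np.unique(labels))                               # at the moment this only prints [0]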

I also don't know if my data is rescaled correctly. My input images have values from 0-255, so I want them to be from 0-1. I don't know whether featurewise_std_normalization=True will do that or not. My labels have values of either 0 or 1. And I also don't know whether I have to pass the number of classes to flow_from_directory for my labels or not.
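
A minimal sketch of what I would try for the scaling (assuming rescale is the right way to get 0-1, and that the masks should not be normalized the same way):

# images: scale 0-255 -> 0-1; masks: keep the 0/1 values as they are
image_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=90,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   zoom_range=0.2)
mask_datagen = ImageDataGenerator(rotation_range=90,
                                  width_shift_range=0.1,
                                  height_shift_range=0.1,
                                  zoom_range=0.2)
# featurewise_center / featurewise_std_normalization would additionally need
# image_datagen.fit(sample_images) before flow_from_directory is called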

Here is my code so far:


import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
import numpy as np
from model import Deeplabv3
import tensorflow as tf
import time
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
from keras import backend as K
K.set_session(session)

batch_size = 16
num_classes = 3
num_img = 2931

NAME = "DeepLab-{}".format(int(time.time()))
deeplab_model = Deeplabv3(input_shape=(300,200,3), classes=num_classes)
tensorboard = TensorBoard(log_dir="Tensorboard/logs/{}".format(NAME))
deeplab_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

# we create two instances with the same arguments
data_gen_args = dict(featurewise_center=True,
                     featurewise_std_normalization=True,
                     rotation_range=90,
                     width_shift_range=0.1,
                     height_shift_range=0.1,
                     zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

# Provide the same seed and keyword arguments to the fit and flow methods
seed = 1

image_generator = image_datagen.flow_from_directory(
    'Keras/Input/',
    target_size=(300,200),
    class_mode=None,
    seed=seed,
    batch_size=batch_size)

mask_generator = mask_datagen.flow_from_directory(
    'Keras/Label/',
    target_size=(300,200),
    class_mode=None,
    seed=seed,
    batch_size=batch_size,
    classes=num_classes) #or maybe not??

# combine generators into one which yields image and masks
train_generator = zip(image_generator, mask_generator)

print("compiled")
deeplab_model.fit_generator(train_generator, steps_per_epoch= np.uint32(num_img / batch_size), epochs=10, callbacks=[tensorboard])
print("finish fit")
deeplab_model.save_weights('deeplab_7.h5')
deeplab_model.save('deeplab-7')

session.close()

tipani86 commented 4 years ago

Hey @thepate94227 I think this is a related issue: https://github.com/bonlime/keras-deeplab-v3-plus/issues/35

I have a similar problem, but not on the output (classes) side, rather on the input side.

When I load the pretrained weights, they assume a certain image configuration (usually the normal 3 channels). However, I wanted to augment my training data with an additional channel (in short, I just concatenated the RGB and grayscale channels together to form 4-channel image data).

However, when I try to train with pretrained weights, it reports a shape mismatch because the weights exist for 3 channels only. I can only train with weights set to None to randomly initialize them.

I was wondering if there is a way to loop through the input layers and generate the extra weights for the additional channel, either randomly initializing that one channel or taking some average of the existing three. Any ideas?
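
Something along these lines is what I had in mind, only a sketch and untested; it assumes both models are built with the same backbone and number of classes so the layers pair up one to one, and that weights='pascal_voc' gives the pretrained 3-channel weights:

import numpy as np
from model import Deeplabv3

# 3-channel model with the pretrained weights, 4-channel model with random ones
model_rgb  = Deeplabv3(input_shape=(512, 512, 3), weights='pascal_voc')
model_rgba = Deeplabv3(input_shape=(512, 512, 4), weights=None)

for src, dst in zip(model_rgb.layers, model_rgba.layers):
    src_weights = src.get_weights()
    if not src_weights:
        continue  # skip layers without weights
    new_weights = []
    for ws, wd in zip(src_weights, dst.get_weights()):
        if ws.shape != wd.shape:
            # first conv kernel: (kh, kw, 3, filters) -> (kh, kw, 4, filters),
            # fill the extra input channel with the mean of the RGB channels
            ws = np.concatenate([ws, ws.mean(axis=2, keepdims=True)], axis=2)
        new_weights.append(ws)
    dst.set_weights(new_weights)

The other option would be to initialize the extra slice randomly instead of using the mean.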

It doesn't seem that urgent a problem, though: when training from scratch I don't see much difference in how my losses behave compared to training with pretrained weights and normal 3-channel image data.

thepate94227 commented 4 years ago

Hey @tipani86,

unfortunately I don't know exactly what you can do. Your ideas seem worth trying 👍 For my task I needed instance segmentation, so I used Mask R-CNN. Another thing you could try is to create the pretrained weights yourself: load all the training images, convert them to 4-channel images like your data, and then train the model. This is maybe the best way to get good pretrained weights, but it takes a lot of time and you may have to adjust hyperparameters.