Closed KlaymenGC closed 8 years ago
Can't you pass the labels as a flattened vector and then use binary_crossentropy loss?
@elanmart thanks for the reply, could you please be more precise? The desired output will be 500x300x21 (21 classes to classify); in each channel of the output, an activation map will be generated showing the probability of each pixel belonging to that class.
Hey, sorry for the late reply.
I was actually wrong. Here's the correct way: Keras' softmax can be applied to 3D tensors, where the softmax is computed over the last dimension.
Your predictions will be batch_size x 500 x 300 x 21. Flatten them into batch_size x 150000 x 21 (500 x 300 = 150000 pixels) and apply softmax. Now all you have to do is supply flattened labels with shape batch_size x 150000, where each element is a scalar indicating the desired label.
At least that's what I'd do, ping @EderSantana?
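A minimal NumPy sketch of the flatten-then-softmax idea described above, outside of Keras. The tiny image size and the helper names softmax_last_axis and sparse_cross_entropy are assumptions made for illustration, not Keras functions.

```python
import numpy as np

def softmax_last_axis(logits):
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_cross_entropy(probs, labels):
    # probs: (batch, pixels, classes); labels: (batch, pixels) integer class ids.
    b, p = labels.shape
    picked = probs[np.arange(b)[:, None], np.arange(p)[None, :], labels]
    return float(-np.log(picked + 1e-12).mean())

rng = np.random.default_rng(0)
batch, height, width, classes = 2, 5, 3, 21  # small stand-in for 500 x 300 images
logits = rng.normal(size=(batch, height, width, classes))
labels = rng.integers(0, classes, size=(batch, height, width))

# Flatten the spatial dimensions so that the class axis stays last.
flat_logits = logits.reshape(batch, height * width, classes)
flat_labels = labels.reshape(batch, height * width)

probs = softmax_last_axis(flat_logits)
loss = sparse_cross_entropy(probs, flat_labels)
print(probs.shape, loss)
```

Each pixel gets its own probability distribution over the 21 classes, and the loss is the mean negative log-probability of the true class.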
@elanmart Again, thank you so much for your reply! I was thinking the same about getting the probability map using the softmax as you described. It may be a dumb question, but my problem now is that I haven't figured out how to train the network using Keras:
model.compile(loss='binary_crossentropy', optimizer='sgd')
This won't work... Could you please shed some light on this one? Thanks in advance!
Here is one thing about Keras: it always assumes that your dimensions are (samples, dim) or (samples, time, dim). In the second case, when calculating the cost, your vectors will be reshaped to (samples*time, dim). If you have something like (samples, time, dim1, dim2), they will be reshaped to (samples*time*dim1, dim2), which will average your cost function in the wrong way.
So, whenever you can, always reshape your output data to something like (samples, time, dim1*dim2*...).
I'll open an issue about that and work on it later.
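The shapes involved can be sketched with plain NumPy. This only illustrates the two reshapes mentioned above; the sizes are arbitrary assumptions and nothing here calls Keras itself.

```python
import numpy as np

samples, time, dim1, dim2 = 4, 6, 5, 3
output = np.zeros((samples, time, dim1, dim2))

# Recommended: collapse the trailing feature dimensions yourself,
# (samples, time, dim1, dim2) -> (samples, time, dim1*dim2).
reshaped = output.reshape(samples, time, dim1 * dim2)

# The implicit flattening described above would instead treat only dim2
# as the feature axis: (samples*time*dim1, dim2).
implicit = output.reshape(-1, dim2)

print(reshaped.shape)  # (4, 6, 15)
print(implicit.shape)  # (120, 3)
```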
@KlaymenGC The following code should work for you. The ugly part is that you actually need to pass your labels as a 3D one-hot matrix. Perhaps this could be fixed in the future?
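import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Reshape, Activation
from keras.optimizers import Adam

NUM_LABELS = 21
NUM_EXAMPLES = 1280
NUM_PIXELS = 1500  # flattened pixel count used in this toy example

# One-hot encode the labels: (NUM_EXAMPLES, NUM_PIXELS) -> (NUM_EXAMPLES, NUM_PIXELS, NUM_LABELS)
# np.random.randint excludes the upper bound, so pass NUM_LABELS to cover every class
inds = np.random.randint(0, NUM_LABELS, size=(NUM_EXAMPLES, NUM_PIXELS))
labels = np.zeros((NUM_EXAMPLES, NUM_PIXELS, NUM_LABELS))
i = np.arange(NUM_EXAMPLES)
j = np.arange(NUM_PIXELS)
ii, jj = np.ix_(i, j)
labels[ii, jj, inds] = 1

# For simplicity we assume the images are flattened.
X = np.random.randn(NUM_EXAMPLES, NUM_PIXELS)

mlp = Sequential()
# Your model goes here
# Assume it ends with a fully connected layer
mlp.add(Dense(512, input_shape=(X.shape[1],), activation='relu'))
# Now we predict one score per (pixel, label) pair
mlp.add(Dense(NUM_PIXELS * NUM_LABELS))
mlp.add(Reshape((NUM_PIXELS, NUM_LABELS)))
# Softmax is applied over the last (label) axis
mlp.add(Activation('softmax'))
mlp.compile(optimizer=Adam(), loss='categorical_crossentropy')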
The best way is to use 2D convolutions with a 21x1x1 kernel (i.e. a hyper-column) for pixel-wise segmentation. However, because of the shape restriction on softmax, I'm getting: 'Exception: Cannot apply softmax to a tensor that is not 2D or 3D. Here, ndim=4'. I've sent a mail to Francois Chollet (the Keras guy)...
if anyone knows how to tackle this please let me know... (ndor123@gmail.com)
Until then, I'm using a sigmoid activation on the hyper-column kernel with mean squared error as the loss function... only 90%... (softmax would have improved it for sure)
@ndor The solution has already been mentioned in the posts above: you just need to apply an argmax to the predicted probabilities and then reshape the result to (img_col, img_row) to get the pixelwise segmentation.
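A small NumPy sketch of that argmax-and-reshape step. It assumes the predictions for one image were flattened in row-major order from an (img_rows, img_cols) grid; the sizes and the random probabilities are placeholders.

```python
import numpy as np

img_rows, img_cols, classes = 4, 6, 21
rng = np.random.default_rng(0)

# Stand-in for the flattened softmax output of one image: (pixels, classes).
flat_probs = rng.random((img_rows * img_cols, classes))
flat_probs /= flat_probs.sum(axis=-1, keepdims=True)

# Pick the most probable class per pixel, then restore the 2D layout.
class_ids = flat_probs.argmax(axis=-1)
segmentation = class_ids.reshape(img_rows, img_cols)

print(segmentation.shape)  # (4, 6)
```

The reshape target has to match the order in which the image was flattened in the first place, otherwise the label map comes out scrambled or transposed.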
@KlaymenGC, flattening will produce a vector (WxHxClasses) with more than a single appearance of 1, which is not the way to classify with softmax...
OK,
So I want to do pixelwise prediction and my output is (SAMPLES, H, W, Classes). Before that I do softmax on (SAMPLES x H x W, Classes), then reshape back to (SAMPLES, H, W, Classes).
I think the problem is with the loss function. With categorical_crossentropy I get the same prediction for all pixels...
Is that because of what @EderSantana wrote? Is there a fix or an open PR?
Thanks
@fchollet ... ? HELP please
So... following some good advice from a friend, I solved it with a Lambda() layer, as follows:
def depth_softmax(matrix):
    sigmoid = lambda x: 1 / (1 + K.exp(-x))
    sigmoided_matrix = sigmoid(matrix)
    softmax_matrix = sigmoided_matrix / K.sum(sigmoided_matrix, axis=0)
    return softmax_matrix
Then you use it in the Lambda() layer:
model.add(Convolution2D(23, 1, 1, border_mode='same', W_regularizer=l1l2(l1=0.0001, l2=0.0001), b_regularizer=None, activity_regularizer=activity_l1l2(l1=0.0001, l2=0.0001)))
model.add(BatchNormalization())
# model.add(Activation('relu'))
model.add(Lambda(depth_softmax))
safe parsing y’all!
I don't think the depth_softmax function code as defined by @ndor is correct. I believe this is a correct implementation:
def depth_softmax(matrix, is_tensor=True):
    # increase temperature to make the softmax more sure of itself
    temp = 5.0
    if is_tensor:
        exp_matrix = K.exp(matrix*temp)
        softmax_matrix = exp_matrix / K.sum(exp_matrix, axis=2, keepdims=True)
    else:
        exp_matrix = np.exp(matrix*temp)
        softmax_matrix = exp_matrix / np.sum(exp_matrix, axis=2, keepdims=True)
    return softmax_matrix
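To see the difference between the two versions, here is a NumPy-only sketch. It assumes the class axis is the last one; depth_softmax_np and sigmoid_normalised are hypothetical helper names, and subtracting the maximum is an extra stability step that the code above does not include.

```python
import numpy as np

def depth_softmax_np(x, temp=1.0):
    # Softmax over the last (class) axis, scaled by `temp`.
    z = x * temp
    z = z - z.max(axis=-1, keepdims=True)  # avoids overflow in exp
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid_normalised(x):
    # The earlier variant: sigmoid, then divide by the sum over axis 0.
    s = 1.0 / (1.0 + np.exp(-x))
    return s / s.sum(axis=0)

rng = np.random.default_rng(0)
scores = rng.normal(size=(2, 2, 5))  # (height, width, classes)

soft = depth_softmax_np(scores, temp=1.0)
sharp = depth_softmax_np(scores, temp=5.0)
other = sigmoid_normalised(scores)

print(soft.sum(axis=-1))   # every pixel sums to 1 over the classes
print(other.sum(axis=-1))  # does not sum to 1 over the classes
```

With temp=5.0 the largest class probability at each pixel is at least as high as with temp=1.0, which is the "more sure of itself" effect mentioned in the comment.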
@rawmean Hi, thank you for your suggestion!
Can you please provide an example how to use that? Do you mean it needs to be passed as activation function?
Also, thank you for the temperature concept!
I am new to Python, Keras and English, so your advice is welcome. Here is how I process 3D data: I change the dimensions of the labels and the output shape of the network, then compare the results of (sigmoid + binary_crossentropy) and (softmax + categorical_crossentropy). The first code is (sigmoid + binary_crossentropy); the second is (softmax + categorical_crossentropy).
-------------------------------------------------first----------------------------------
from __future__ import print_function
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import os
import keras
import PIL
from PIL import Image
from keras import Model, Input, optimizers
from keras.applications import vgg16, inception_v3, resnet50, mobilenet
from keras.layers import Conv2D,Lambda,Reshape
from keras.preprocessing.image import ImageDataGenerator, load_img
# Data preprocessing
# Convert my ground-truth labels from 2284*30*40*1 to a 2284*1200*14 one-hot encoding
# 2284 is the number of pictures
# 14 is the number of categories
# img and lab are the arrays of your pictures and label images
# img.shape = 2284*480*640*3
# lab.shape = 2284*480*640
# trainval_list is the list of train and validation indices; the 2284 pictures include 900+ test pictures, so I need to filter them out
img_trainval = img[trainval_list, :, :, :]
mini_lab = lab[:,::16,::16]
onehot_lab = np.zeros(shape=(2284, 1200, 14))
for i in range(2284):
    pic_lab = mini_lab[i, :, :]
    pic_flatten = np.reshape(pic_lab, (1, 1200))
    pic_onehot = keras.utils.to_categorical(pic_flatten, 14)
    onehot_lab[i] = pic_onehot
lab_trainval = onehot_lab[trainval_list, :, :]
# The network structure is extremely simple
os.environ['CUDA_VISIBLE_DEVICES']='0'
resnet_model = resnet50.ResNet50(weights = 'imagenet', include_top=False,input_shape = (480,640,3))
layer_name = 'activation_40'
res16 = Model(inputs=resnet_model.input, outputs=resnet_model.get_layer(layer_name).output)
input_real = Input(shape=(480,640,3))
sgd = optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
x = res16(input_real)
x = Conv2D(14, (1, 1), activation='relu')(x)
sig_out = Conv2D(14,(1,1),activation = 'sigmoid')(x)
out_reshape = Reshape((1200,14))(sig_out)
# Configure the training parameters (still a lot of things to learn)
model_simple1 = Model(inputs=input_real, outputs=out_reshape)
model_simple1.summary()
model_simple1.compile(loss="binary_crossentropy", optimizer=sgd, metrics=['accuracy','categorical_accuracy'])
model_simple1.fit(x=img_trainval, y=lab_trainval, epochs=200, shuffle=True, batch_size=2)
The structure of the network is:
warnings.warn('The output shape of ResNet50(include_top=False)
'
Layer (type)           Output Shape            Param #
input_2 (InputLayer)   (None, 480, 640, 3)     0
model_1 (Model)        (None, 30, 40, 1024)    8589184
conv2d_1 (Conv2D)      (None, 30, 40, 14)      14350
conv2d_2 (Conv2D)      (None, 30, 40, 14)      210
Total params: 8,603,744
Trainable params: 8,573,152
Non-trainable params: 30,592
The accuracy:
Epoch 1/200
1370/1370 [==============================] - 224s 164ms/step - loss: 0.2772 - acc: 0.8955 - categorical_accuracy: 0.2184
Epoch 2/200
1370/1370 [==============================] - 218s 159ms/step - loss: 0.2113 - acc: 0.9281 - categorical_accuracy: 0.2910
--------------------------------------------------second----------------------------
from __future__ import print_function
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import os
import keras
import PIL
from PIL import Image
from keras import Model, Input, optimizers
from keras.applications import vgg16, inception_v3, resnet50, mobilenet
from keras.layers import Conv2D,Lambda,Reshape
from keras.preprocessing.image import ImageDataGenerator, load_img
# Data preprocessing
# Convert my ground-truth labels from 2284*30*40*1 to a 2284*1200*14 one-hot encoding
# 2284 is the number of pictures
# 14 is the number of categories
# img and lab are the arrays of your pictures and label images
# img.shape = 2284*480*640*3
# lab.shape = 2284*480*640
# trainval_list is the list of train and validation indices; the 2284 pictures include 900+ test pictures, so I need to filter them out
img_trainval = img[trainval_list, :, :, :]
mini_lab = lab[:,::16,::16]
onehot_lab = np.zeros(shape=(2284, 1200, 14))
for i in range(2284):
    pic_lab = mini_lab[i, :, :]
    pic_flatten = np.reshape(pic_lab, (1, 1200))
    pic_onehot = keras.utils.to_categorical(pic_flatten, 14)
    onehot_lab[i] = pic_onehot
lab_trainval = onehot_lab[trainval_list, :, :]
# The network structure is extremely simple
os.environ['CUDA_VISIBLE_DEVICES']='1'
resnet_model = resnet50.ResNet50(weights = 'imagenet', include_top=False,input_shape = (480,640,3))
layer_name = 'activation_40'
res16 = Model(inputs=resnet_model.input, outputs=resnet_model.get_layer(layer_name).output)
input_real = Input(shape=(480,640,3))
sgd = optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
x = res16(input_real)
x = Conv2D(14, (1, 1), activation='relu')(x)
x = Conv2D(14, (1, 1), activation='softmax')(x)
out_reshape = Reshape((1200,14))(x)
# Configure the training parameters (still a lot of things to learn)
model_simple1 = Model(inputs=input_real, outputs=out_reshape)
model_simple1.summary()
model_simple1.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=['accuracy','categorical_accuracy'])
model_simple1.fit(x=img_trainval, y=lab_trainval, epochs=200, shuffle=True, batch_size=2)
The structure of the second network:
warnings.warn('The output shape of ResNet50(include_top=False)
'
Layer (type)           Output Shape            Param #
input_2 (InputLayer)   (None, 480, 640, 3)     0
model_1 (Model)        (None, 30, 40, 1024)    8589184
conv2d_1 (Conv2D)      (None, 30, 40, 14)      14350
conv2d_2 (Conv2D)      (None, 30, 40, 14)      210
Total params: 8,603,744
Trainable params: 8,573,152
Non-trainable params: 30,592
The accuracy:
Epoch 1/200
1370/1370 [==============================] - 239s 174ms/step - loss: 2.0305 - acc: 0.3117 - categorical_accuracy: 0.3117
Epoch 2/200
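One likely reason the "acc" column differs so much between the two runs: as far as I know, with a binary_crossentropy loss the Keras 'accuracy' metric is resolved to a per-element binary accuracy, which counts every correct 0 in the one-hot target. The NumPy sketch below uses made-up labels and a deliberately useless predictor to show how high that number can get on 14-class one-hot data.

```python
import numpy as np

num_pixels, num_classes = 1200, 14
rng = np.random.default_rng(0)
labels = rng.integers(0, num_classes, size=num_pixels)
y_true = np.eye(num_classes)[labels]  # one-hot targets, (1200, 14)

# A useless predictor: it outputs 0 for every class at every pixel.
y_pred = np.zeros_like(y_true)

# Element-wise binary accuracy: 13 of the 14 entries per pixel are "correct" zeros.
binary_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))

# Per-pixel categorical accuracy: argmax of the prediction against the true class.
categorical_acc = np.mean(y_pred.argmax(axis=-1) == labels)

print(binary_acc)       # 13/14
print(categorical_acc)  # roughly 1/14
```

So for comparing the two setups, categorical_accuracy is the more meaningful column.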
I didn't attach a weight to each class, even though class 0 means unlabeled in the ground truth. I didn't save my weights. I didn't figure out the accuracy. There might be some errors: I am not sure whether numpy's reshape is consistent with the Reshape layer, and I am not sure whether it matters. So far, the training results show that the first code runs faster but with lower categorical accuracy, while the second is slower but more accurate.
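On the reshape question: to my knowledge the Keras Reshape layer uses the same row-major ordering as numpy's default reshape, so pixel (r, c) of a 30 x 40 map should land at index r*40 + c in both. The sketch below checks that ordering in NumPy only, and also builds the one-hot labels without a Python loop. The array sizes are small stand-ins and the random data is an assumption.

```python
import numpy as np

num_classes = 14
rng = np.random.default_rng(0)
lab = rng.integers(0, num_classes, size=(3, 480, 640))  # stand-in for the label images

mini_lab = lab[:, ::16, ::16]                   # (3, 30, 40)
flat_lab = mini_lab.reshape(len(mini_lab), -1)  # (3, 1200), row-major
onehot = np.eye(num_classes, dtype=np.float32)[flat_lab]  # (3, 1200, 14)

# Stand-in for a network output of shape (batch, 30, 40, 14), flattened the same way.
output = rng.random((3, 30, 40, num_classes))
flat_out = output.reshape(3, 30 * 40, num_classes)

# Pixel (r, c) maps to flat index r * 40 + c for both labels and outputs.
r, c = 7, 11
print(onehot[0, r * 40 + c].argmax() == mini_lab[0, r, c])
print(np.array_equal(flat_out[0, r * 40 + c], output[0, r, c]))
```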
It seems now that you can simply use a softmax activation on the last Conv2D layer, specify the categorical_crossentropy loss, and train on the image without any reshaping tricks. I've tested with a dummy dataset and it works well. Try it ~ !
inp = keras.Input(...)
# define your model here
out = keras.layers.Conv2D(classes, (1, 1), activation='softmax') (...)
model = keras.Model(inputs=[inp], outputs=[out], name='unet')
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(tensor4d, tensor4d)
You can also compile using sparse_categorical_crossentropy and then train with output of shape (samples, height, width), where each pixel in the output corresponds to a class label: model.fit(tensor4d, tensor3d)
PS. I use keras from tensorflow.keras (TensorFlow 2).
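The two label formats carry the same information, so the per-pixel loss should come out identical. Here is a NumPy sketch of that equivalence; categorical_ce and sparse_categorical_ce are hypothetical re-implementations written for illustration, not the Keras functions themselves.

```python
import numpy as np

def categorical_ce(y_true_onehot, probs, eps=1e-7):
    # One-hot targets of shape (samples, height, width, classes).
    probs = np.clip(probs, eps, 1.0)
    return -(y_true_onehot * np.log(probs)).sum(axis=-1)

def sparse_categorical_ce(labels, probs, eps=1e-7):
    # Integer targets of shape (samples, height, width).
    probs = np.clip(probs, eps, 1.0)
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return -np.log(picked)

rng = np.random.default_rng(0)
samples, height, width, classes = 2, 4, 5, 3
logits = rng.normal(size=(samples, height, width, classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

labels = rng.integers(0, classes, size=(samples, height, width))  # like tensor3d
onehot = np.eye(classes)[labels]                                  # like tensor4d

loss_onehot = categorical_ce(onehot, probs)
loss_sparse = sparse_categorical_ce(labels, probs)
print(np.allclose(loss_onehot, loss_sparse))
```

The sparse form avoids materialising a (samples, height, width, classes) label array, which matters for large images with many classes.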
Hello,
I'm building an end-to-end network to produce a pixelwise probability map of the input images. The input images (500x300x3) have pixelwise labels (500x300) indicating which class each pixel belongs to. The data and label are defined as follows:
I want to use cross entropy as the loss. Could anyone suggest a way to implement this using Keras? Thanks in advance!