solving multi label image classification using timedistributed dense layer

suraj-deshmukh commented 8 years ago

I have a multi-label image dataset with 5 labels. Each image can have more than one label at the same time. I am using a convolutional neural network to extract features. I pass the extracted features to a RepeatVector layer to create 5 copies of them, and after the RepeatVector layer I have connected a TimeDistributed layer wrapping a Dense layer with 2 units, i.e. TimeDistributed(Dense(2)). y_train is a 3D array of shape (1600, 5, 2) and x_train is an array of images. For example:

    x_train.shape
    (1600, 3, 100, 100)
    y_train.shape
    (1600, 5, 2)
    y_train[0] =
    array([[0, 1],   # [0, 1] = label present (1), [1, 0] = label absent (0)
           [1, 0],
           [1, 0],
           [1, 0],
           [1, 0]])

code:

    def get_label(y):
      # Map each binary label to a one-hot pair:
      # 0 (absent) -> [1, 0], 1 (present) -> [0, 1].
      d = {0: [1, 0], 1: [0, 1]}
      return [d[value] for value in y]

    X,Y= get_data()
    Y = Y.tolist()
    y = []
    for value in Y:
      y.append(get_label(value))

    Y = np.array(y,dtype=int)

    x_train, x_test, y_train, y_test = train_test_split(X,Y,test_size =0.2,random_state=100)

    img_channels = 3
    img_rows     = 100
    img_cols     = 100
    nb_classes   = 5 

    model = Sequential()

    model.add(Convolution2D(32, 3, 3, border_mode='same',input_shape=(img_channels, img_rows, img_cols)))
    model.add(Activation('relu'))
    model.add(Convolution2D(32, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Convolution2D(64, 3, 3, border_mode='same'))
    model.add(Activation('relu'))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(RepeatVector(nb_classes))
    model.add(TimeDistributed(Dense(2)))
    model.add(Activation('softmax'))

    # Train the model using plain SGD (momentum is set to 0.0 here).
    # opt = RMSprop(lr=0.001, rho=0.9, epsilon=1e-06)
    opt = SGD(lr=0.01, momentum=0.0, decay=1e-6, nesterov=False)
    model.compile(loss='categorical_crossentropy', optimizer=opt,metrics=['accuracy'])
    model.fit(x_train,y_train,nb_epoch=10,batch_size=32,validation_data=(x_test,y_test),shuffle=True)
    out = model.predict_classes(x_test)

But after training I get all 0's for the test set. Is this approach wrong?

joelthchao commented 8 years ago

Why don't you make y a single value (1: present, 0: absent), so that y.shape = (1600, 5, 1)? IMHO, softmax is not necessary; replace it with relu. Also, binary_crossentropy is better for a multi-label problem.

suraj-deshmukh commented 8 years ago

@joelthchao I tried your solution, but the problem is still not solved. For the test set I am still getting all zeros.

joelthchao commented 8 years ago

@suraj-deshmukh How is your validation loss/accuracy? Make sure it looks normal. Also, you can use predict instead of predict_classes to see the raw output.
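To illustrate the difference between the two calls, here is a minimal NumPy sketch. It assumes that predict returns an array of shape (n_samples, 5, 2) for the model in the question and that predict_classes amounts to an argmax over the last axis. The numbers in `raw` are made up for illustration and do not come from the thread.

```python
import numpy as np

# Hypothetical raw output of model.predict for one image: shape (1, 5, 2).
# Column 0 = "absent" score, column 1 = "present" score (values are made up).
raw = np.array([[[0.70, 0.30],
                 [0.55, 0.45],
                 [0.90, 0.10],
                 [0.52, 0.48],
                 [0.60, 0.40]]])

# Assumed behaviour of predict_classes: argmax over the last axis.
classes = raw.argmax(axis=-1)

# The raw "present" scores show how close each label was to flipping.
present_scores = raw[..., 1]

print(classes)
print(present_scores)
```

With these values every label is decoded as 0 even though two of the "present" scores are close to 0.5, which is the kind of detail that only the raw output reveals.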

esube commented 8 years ago

@suraj-deshmukh to add to what @joelthchao said above: for multi-label training, if your labels are balanced, you don't require an exotic custom loss function, and you don't need any sample-based weighting, you can just make y 2D [n_samples x n_labels], with binary values indicating the presence or absence of each label.

Then, just use a simple Dense layer with 5 output neurons instead of TimeDistributed(Dense). Then, use binary_crossentropy. Also, use sigmoid on the output instead of softmax. Also, get rid of the RepeatVector layer. predict_classes doesn't make sense in this context, so use predict as @joelthchao suggested above.
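As a rough sketch of the data side of this suggestion, the NumPy snippet below converts the (n_samples, 5, 2) one-hot pairs from the question into a 2D binary target, thresholds sigmoid outputs per label, and computes a mean binary cross-entropy. The arrays `y_pairs` and `probs` and the helper `binary_crossentropy` are hypothetical, illustrative stand-ins, not Keras code.

```python
import numpy as np

# One-hot pairs as in the question: [1, 0] = absent, [0, 1] = present.
y_pairs = np.array([[[0, 1], [1, 0], [1, 0], [1, 0], [1, 0]],
                    [[1, 0], [0, 1], [0, 1], [1, 0], [1, 0]]])  # (2, 5, 2)

# Keep only the "present" column to get a 2D target (n_samples, n_labels).
y_binary = y_pairs[..., 1]

# Hypothetical sigmoid outputs of a 5-unit output layer (values are made up).
probs = np.array([[0.91, 0.08, 0.12, 0.30, 0.05],
                  [0.20, 0.75, 0.66, 0.10, 0.45]])

# Threshold each label independently instead of taking an argmax.
pred = (probs >= 0.5).astype(int)


def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy over all samples and labels."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


loss = binary_crossentropy(y_binary, probs)
print(y_binary, pred, loss)
```

The 0.5 threshold is an assumption; a different cut-off per label may suit an imbalanced dataset better.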

Hope that helps.

suraj-deshmukh commented 8 years ago

@esube I already tried what you suggested and it works fine. But I want to solve this problem with the technique explained in the question.

joelthchao commented 8 years ago

@suraj-deshmukh Actually, the method provided by @esube is equivalent to using RepeatVector, unless you want to have more Dense layers without sharing hidden units, like CVPR'14 (Fig. 2).

suraj-deshmukh commented 8 years ago

@joelthchao So the TimeDistributed layer does the same job as Fig. 2 of that paper? Correct me if I am wrong.

joelthchao commented 8 years ago

@suraj-deshmukh Yes, exactly the same thing.

alyato commented 8 years ago

@suraj-deshmukh, I checked the documentation of TimeDistributed [http://keras.io/layers/core/#timedistributeddense]:

Input shape: 3D tensor with shape (nb_sample, time_dimension, input_dim).
Output shape: 3D tensor with shape (nb_sample, time_dimension, output_dim).

I think TimeDistributed should be used for text, not for images. How do you set the value of time_dimension? Do you also use it for images, and if so, how? Thanks.

suraj-deshmukh commented 8 years ago

@alyato I think it depends on the problem statement. We can certainly apply a TimeDistributed layer to multi-label image classification; see Fig. 2 of Pose Aligned Networks for Deep Attribute Modeling.

alyato commented 8 years ago

@suraj-deshmukh I read the paper and don't understand why the TimeDistributed layer is used. What does the TimeDistributed layer do here?

    model.add(Flatten())
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(RepeatVector(nb_classes))
    model.add(TimeDistributed(Dense(2)))
    model.add(Activation('softmax'))

suraj-deshmukh commented 8 years ago

@alyato RepeatVector repeats the input feature vector 'n' times, and TimeDistributed(Dense(x)) applies a single fully connected Dense layer with 'x' neurons/classes to each vector created by RepeatVector.

For example, say we have an input feature vector of size 1 x 10. After applying RepeatVector(10) it becomes a 10 x 10 matrix where each row contains the same feature vector. TimeDistributed(Dense(2)) then applies a single fully connected Dense layer (with 2 neurons) to each row separately.
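The NumPy sketch below mimics this pipeline for one sample and 5 copies (matching the 5 labels in the question). It assumes, as the Keras documentation describes, that TimeDistributed reuses one set of weights for every row; `W`, `b`, and the random features are hypothetical stand-ins for trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

# One sample with a 10-dimensional feature vector.
features = rng.normal(size=(1, 10))

# RepeatVector(5): copy the vector 5 times -> shape (1, 5, 10).
repeated = np.repeat(features[:, None, :], 5, axis=1)

# A single shared Dense(2): the same W and b are applied to every row.
W = rng.normal(size=(10, 2))
b = rng.normal(size=(2,))
logits = repeated @ W + b  # shape (1, 5, 2)


def softmax(z):
    """Softmax over the last axis, shifted for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


out = softmax(logits)

# Identical inputs through identical weights give identical rows.
rows_identical = bool(np.allclose(out, out[:, :1, :]))
print(out.shape, rows_identical)
```

Under this assumption all 5 rows of the output are the same, so the 5 labels of an image cannot receive different predictions unless each copy is given its own weights.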

alyato commented 8 years ago

@suraj-deshmukh, I understand a little of what you are saying.

    model.add(Dense(512))
    model.add(RepeatVector(7))
    model.add(TimeDistributed(Dense(2)))
    model.add(Activation('softmax'))

But I don't understand the 'softmax'. Is it repeated according to the RepeatVector? Like this:

                     Dense(512)
          /              |              \        (7 in total)
    Dense(512)      Dense(512)   ...   Dense(512)
     Dense(2)        Dense(2)    ...    Dense(2)
     softmax         softmax     ...    softmax
      output          output     ...     output

So in the end we get 7 outputs, like multi-mask?

suraj-deshmukh commented 8 years ago

@alyato yes, you are right

alyato commented 8 years ago

@suraj-deshmukh, I have run into a weird problem and don't know what to do. I collected 1500 medical images to classify diseases (every picture has a single label). But when I train my model, the loss, training accuracy, and validation accuracy don't change from epoch to epoch. xxxx Could you give me some advice? Thanks.

suraj-deshmukh commented 8 years ago

@alyato
Without looking at your code I cannot comment. Can you share your code snippet?

alyato commented 8 years ago

@suraj-deshmukh Yeah, thanks. The dataset is 1500 faces. Every image has one label. I want to input one face and predict one of the diseases. There are 9 diseases in total, so the labels are [0,1,2,3,4,5,6,7,8]. Label 0 has 910 images, label 1 has 145 images, ..., label 8 has 19 images.
I selected 1000 images as the training set; the rest form the validation set. The train_dataset: The validate_data: The code of my model: Thanks.

suraj-deshmukh commented 8 years ago

@alyato Try optimizer=sgd without single/double quotes. You created the optimizer object but didn't pass it to the compile method.

rakashi commented 6 years ago

I am classifying images. I have to get the confidence for each object in the image independently. With softmax we can classify, but the sum of all probabilities equals one. I need the probabilities to be independent: when I give an image, I need to get a separate probability for each object.
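A small NumPy sketch of the distinction, using made-up scores for three hypothetical objects: softmax couples the outputs so they sum to one, while an element-wise sigmoid (the activation suggested earlier in this thread for the multi-label case) scores each object on its own.

```python
import numpy as np

# Hypothetical pre-activation scores for three objects (values are made up).
logits = np.array([2.0, 1.5, -1.0])

# Softmax: outputs compete with each other and always sum to 1.
softmax = np.exp(logits - logits.max())
softmax /= softmax.sum()

# Sigmoid: each output is an independent value in (0, 1).
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax, softmax.sum())
print(sigmoid, sigmoid.sum())
```

With sigmoid outputs, two objects can both receive a high score for the same image, which softmax cannot express.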

keras-team / keras

solving multi label image classification using timedistributed dense layer #2542