dluvizon / pose-regression

2D human pose regression
MIT License

Training and Dataset preparation #3

Closed SaifAlDilaimi closed 6 years ago

SaifAlDilaimi commented 6 years ago

Hey,

I'm also trying to train a network on the LSP dataset. Would it be possible to share the code you used to prepare the dataset and train the model? In this repository I only found the model definition, not the training code.

Would love to hear back!

SaifAlDilaimi commented 6 years ago

@dluvizon would love to hear from you.

dluvizon commented 6 years ago

Hi @SaifAlDilaimi ,

I did not provide the training code here because this project is still under development (in a private repo), but it will be available soon. In the meantime, you can find all the details for training in the paper.

The procedures for LSP and MPII are very similar: basically, all you need to do is crop the images using the ground truth bounding boxes and apply some data augmentation to them (see section 4.1 of the paper).

All the best,
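
The cropping step described above could be sketched as follows. This is only an illustration, not the author's training code: the function name `crop_with_bbox` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions, and data augmentation is not shown.

```python
import numpy as np

def crop_with_bbox(image, joints, bbox):
    """Crop `image` to `bbox` and shift `joints` into the crop's frame.

    image:  (H, W, 3) array
    joints: (J, 2) array of (x, y) pixel coordinates
    bbox:   (x_min, y_min, x_max, y_max), a hypothetical layout
    """
    x_min, y_min, x_max, y_max = bbox
    # Clamp the box to the image so slicing never leaves the array.
    h, w = image.shape[:2]
    x_min, y_min = max(0, x_min), max(0, y_min)
    x_max, y_max = min(w, x_max), min(h, y_max)
    crop = image[y_min:y_max, x_min:x_max]
    # Joints must move by the same offset as the crop origin.
    offset = np.array([x_min, y_min], dtype=np.float32)
    shifted = joints.astype(np.float32) - offset
    return crop, shifted

image = np.zeros((100, 200, 3), dtype=np.uint8)
joints = np.array([[50, 40], [120, 60]])
crop, shifted = crop_with_bbox(image, joints, (30, 20, 150, 90))
print(crop.shape, shifted.tolist())
```

The important detail is that the joint coordinates are shifted by the same offset as the crop, so the labels stay aligned with the cropped image.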

SaifAlDilaimi commented 6 years ago

hey @dluvizon ,

thank you for the answer!

I hope I'm not rude if I ask some specific things:

I'm currently working on my BSc thesis and would love to cite your work, if that's OK.

dluvizon commented 6 years ago

Hi, my answers:

1) For training on LSP, you can use its 14 joints plus two more "invalid" joints, for which you always set the loss to zero.
2) In LSP, the images are already cropped, so just use them as they are (at most, crop them to a square).
3) This is an affine transformation: https://en.wikipedia.org/wiki/File:2D_affine_transformation_matrix.svg Just apply the same transformation to the images and to the 2D points.

Of course it is OK to cite it. BTW, I extended the work in this paper, if you are interested: https://arxiv.org/abs/1802.09232
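
Points 1 and 3 of the answer above can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration (the helper names `pad_lsp_joints`, `masked_l1`, and `affine_points`, the choice of an L1 error, and zero-filling the two extra joints); it is not the author's implementation.

```python
import numpy as np

def pad_lsp_joints(joints14, num_joints=16):
    """Pad (14, 2) LSP joints to (num_joints, 2) and return a validity mask."""
    pad = num_joints - joints14.shape[0]
    joints = np.concatenate([joints14, np.zeros((pad, 2))], axis=0)
    valid = np.concatenate([np.ones(joints14.shape[0]), np.zeros(pad)])
    return joints, valid

def masked_l1(pred, target, valid):
    """Mean absolute error in which invalid joints contribute zero loss."""
    per_joint = np.abs(pred - target).sum(axis=-1)
    return (per_joint * valid).sum() / max(valid.sum(), 1.0)

def affine_points(points, matrix):
    """Apply a 3x3 homogeneous affine matrix to (J, 2) points."""
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.concatenate([points, ones], axis=1)
    return (homogeneous @ matrix.T)[:, :2]

# Scale by 2, then translate by (10, -5).
matrix = np.array([[2.0, 0.0, 10.0],
                   [0.0, 2.0, -5.0],
                   [0.0, 0.0, 1.0]])
points = np.array([[1.0, 1.0], [3.0, 4.0]])
moved = affine_points(points, matrix)

joints16, valid = pad_lsp_joints(np.ones((14, 2)))
loss = masked_l1(joints16 + 1.0, joints16, valid)
print(moved.tolist(), joints16.shape, loss)
```

To keep images and labels consistent, the same matrix would also be handed to an image warping routine (for example, OpenCV's `cv2.warpAffine` takes the top two rows as a 2x3 matrix).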

SaifAlDilaimi commented 6 years ago

Hey @dluvizon ,

thank you for the help! I'm currently trying to use your network model, but I'm getting an error and I can't pinpoint where it comes from.

Dimensions must be equal, but are 18 and 32 for 'Agg/model_1/dense_1/MatMul' (op: 'MatMul') with input shapes:

can you give me a hint?

dluvizon commented 6 years ago

Unfortunately, I cannot help you with just this single line of the error. Maybe it is related to some change that you made?

SaifAlDilaimi commented 6 years ago

Hey @dluvizon ,

Sorry for my last post, I had a wrong implementation. Still, I got a new error that I can't fix. I have prepared the MPII dataset as you said, copied your network into a new file called DeepPoseNetwork, and used it as follows:

def train(x_train, y_train, x_val, y_val, x_test, y_test, model_type=PreTrainedModel.CUSTOM):
    model = build_model()

    print("Shape of x_train: ", x_train.shape, ", of type: ", x_train.dtype)
    print("Shape of y_train: ", y_train[0].shape, ", of type: ", y_train.dtype)
    print("Shape of x_val: ", x_val.shape, ", of type: ", x_val.dtype)
    print("Shape of y_val: ", y_val.shape, ", of type: ", y_val.dtype)
    print("Shape of x_test: ", x_test.shape, ", of type: ", x_test.dtype)
    print("Shape of y_test: ", y_test.shape, ", of type: ", y_test.dtype)

    # checkpoint
    filepath="weights_"+model_type.name+".best.hdf5"
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.4, patience=5, min_lr=0.001)
    checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    tensorboard = TensorBoard(log_dir="logs/{}".format(time()), histogram_freq=0, write_graph=True, write_images=False)
    csv_logger = CSVLogger('log_'+model_type.name+'.csv', append=True, separator=';')

    callbacks_list = [reduce_lr, checkpoint, tensorboard, csv_logger]

    model.fit(x_train, y_train,
            batch_size=PARAMS.ML_BATCH_SIZE,
            epochs=PARAMS.ML_EPOCHS,
            verbose=1,
            validation_data=(x_val, y_val), callbacks=callbacks_list)
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

def build_model():
    if K.image_data_format() == 'channels_first':
        input_shape = (3, PARAMS.ML_INPUT_IMAGE_HEIGHT, PARAMS.ML_INPUT_IMAGE_WIDTH)
    else:
        input_shape = (PARAMS.ML_INPUT_IMAGE_WIDTH, PARAMS.ML_INPUT_IMAGE_HEIGHT, 3)

    m_input = Input(shape=input_shape)
    m_outputs = []

    num_heatmaps = (PARAMS.ML_LABEL_SIZE + 1) * PARAMS.ML_LABEL_CLASSES

    x = DeepPoseNetwork.stem(m_input)

    num_rows, num_cols, num_filters = K.int_shape(x)[1:]

    # Build the soft-argmax models (no parameters) for specialized and
    # contextual maps.
    sams_input_shape = (num_rows, num_cols, PARAMS.ML_LABEL_CLASSES)
    sam_s_model = DeepPoseNetwork.build_softargmax_2d(sams_input_shape, name='sSAM')
    jprob_s_model = DeepPoseNetwork.build_joints_probability(sams_input_shape, name='sjProb')

    # Build the aggregation model (no parameters)
    if PARAMS.ML_LABEL_SIZE > 0:
        samc_input_shape = (num_rows, num_cols, num_heatmaps - PARAMS.ML_LABEL_CLASSES)
        sam_c_model = DeepPoseNetwork.build_softargmax_2d(samc_input_shape, name='cSAM')
        jprob_c_model = DeepPoseNetwork.build_joints_probability(samc_input_shape, name='cjProb')
        agg_model = DeepPoseNetwork.build_context_aggregation(PARAMS.ML_LABEL_CLASSES, PARAMS.ML_LABEL_SIZE, PARAMS.ML_DEEPPOSE_ALPHA, name='Agg')

    for bidx in range(PARAMS.ML_DEEPPOSE_STAGES):
        block_shape = K.int_shape(x)[1:]
        x = DeepPoseNetwork.build_reception_block(x, name='rBlock%d' % (bidx + 1), ksize=PARAMS.ML_DEEPPOSE_KERNEL_SIZE)

        ident_map = x
        x = DeepPoseNetwork.build_sconv_block(x, name='SepConv%d' % (bidx + 1), ksize=PARAMS.ML_DEEPPOSE_KERNEL_SIZE)
        h = DeepPoseNetwork.build_regmap_block(x, num_heatmaps, name='RegMap%d' % (bidx + 1))

        if PARAMS.ML_LABEL_SIZE > 0:
            pose, visible, hm = DeepPoseNetwork.pose_regression_context(h, PARAMS.ML_LABEL_CLASSES, sam_s_model, sam_c_model, jprob_c_model, agg_model, jprob_s_model)
        else:
            pose, visible, hm = DeepPoseNetwork.pose_regression(h, sam_s_model, jprob_s_model)

        m_outputs.append(pose)
        m_outputs.append(visible)
        if PARAMS.ML_DEEPPOSE_EXPORT_HEATMAPS:
            m_outputs.append(hm)

        if bidx < PARAMS.ML_DEEPPOSE_STAGES - 1:
            h = DeepPoseNetwork.build_fremap_block(h, block_shape[-1], name='fReMap%d' % (bidx + 1))
            x = add([ident_map, x, h])

    model = Model(inputs=m_input, outputs=m_outputs)

    rmsprop = RMSprop(lr=PARAMS.ML_LEARN_RATE)

    model.compile(rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

    model.summary()

    return model

def main():
    types = [PreTrainedModel.DEEPPOSE]

    (x_train, y_train), (x_val, y_val), (x_test, y_test) = Dataset(dstype=DatasetType.MPII).prepare_sets()

    print('Starting training...')
    train(x_train, y_train, x_val, y_val, x_test, y_val)

Running this code prints the following shapes for the numpy arrays:

Shape of x_train:  (15935, 256, 256, 3) , of type:  int8
Shape of y_train:  (16, 2) , of type:  int32
Shape of x_val:  (1992, 256, 256, 3) , of type:  int8
Shape of y_val:  (1992, 16, 2) , of type:  int32
Shape of x_test:  (1992, 256, 256, 3) , of type:  int8
Shape of y_test:  (1992, 16, 2) , of type:  int32

After Keras displays the network's summary, it throws the following error:

ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 16 array(s), but instead got the following list of 1 arrays: [array([[[219, 147],
        [157, 131],
        [103, 156],
        ...,
        [ 59, 134],
        [106, 125],
        [145, 102]],

       [[228, 123],
        [187, 123],
        [146, 120],
    ...

The summary of the Keras Model prints this:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 256, 256, 3)  0
__________________________________________________________________________________________________
Stem (Model)                    (None, 32, 32, 576)  1039488     input_1[0][0]
__________________________________________________________________________________________________
rBlock1 (Model)                 (None, 32, 32, 576)  1312128     Stem[1][0]
__________________________________________________________________________________________________
SepConv1 (Model)                (None, 32, 32, 576)  347904      rBlock1[1][0]
__________________________________________________________________________________________________
RegMap1 (Model)                 (None, 32, 32, 48)   27648       SepConv1[1][0]
__________________________________________________________________________________________________
fReMap1 (Model)                 (None, 32, 32, 576)  29376       RegMap1[1][0]
__________________________________________________________________________________________________
add_12 (Add)                    (None, 32, 32, 576)  0           rBlock1[1][0]
                                                                 SepConv1[1][0]
                                                                 fReMap1[1][0]
__________________________________________________________________________________________________
rBlock2 (Model)                 (None, 32, 32, 576)  1312128     add_12[0][0]
__________________________________________________________________________________________________
SepConv2 (Model)                (None, 32, 32, 576)  347904      rBlock2[1][0]
__________________________________________________________________________________________________
RegMap2 (Model)                 (None, 32, 32, 48)   27648       SepConv2[1][0]
__________________________________________________________________________________________________
fReMap2 (Model)                 (None, 32, 32, 576)  29376       RegMap2[1][0]
__________________________________________________________________________________________________
add_22 (Add)                    (None, 32, 32, 576)  0           rBlock2[1][0]
                                                                 SepConv2[1][0]
                                                                 fReMap2[1][0]
__________________________________________________________________________________________________
rBlock3 (Model)                 (None, 32, 32, 576)  1312128     add_22[0][0]
__________________________________________________________________________________________________
SepConv3 (Model)                (None, 32, 32, 576)  347904      rBlock3[1][0]
__________________________________________________________________________________________________
RegMap3 (Model)                 (None, 32, 32, 48)   27648       SepConv3[1][0]
__________________________________________________________________________________________________
fReMap3 (Model)                 (None, 32, 32, 576)  29376       RegMap3[1][0]
__________________________________________________________________________________________________
add_32 (Add)                    (None, 32, 32, 576)  0           rBlock3[1][0]
                                                                 SepConv3[1][0]
                                                                 fReMap3[1][0]
__________________________________________________________________________________________________
rBlock4 (Model)                 (None, 32, 32, 576)  1312128     add_32[0][0]
__________________________________________________________________________________________________
SepConv4 (Model)                (None, 32, 32, 576)  347904      rBlock4[1][0]
__________________________________________________________________________________________________
RegMap4 (Model)                 (None, 32, 32, 48)   27648       SepConv4[1][0]
__________________________________________________________________________________________________
fReMap4 (Model)                 (None, 32, 32, 576)  29376       RegMap4[1][0]
__________________________________________________________________________________________________
add_42 (Add)                    (None, 32, 32, 576)  0           rBlock4[1][0]
                                                                 SepConv4[1][0]
                                                                 fReMap4[1][0]
__________________________________________________________________________________________________
rBlock5 (Model)                 (None, 32, 32, 576)  1312128     add_42[0][0]
__________________________________________________________________________________________________
SepConv5 (Model)                (None, 32, 32, 576)  347904      rBlock5[1][0]
__________________________________________________________________________________________________
RegMap5 (Model)                 (None, 32, 32, 48)   27648       SepConv5[1][0]
__________________________________________________________________________________________________
fReMap5 (Model)                 (None, 32, 32, 576)  29376       RegMap5[1][0]
__________________________________________________________________________________________________
add_52 (Add)                    (None, 32, 32, 576)  0           rBlock5[1][0]
                                                                 SepConv5[1][0]
                                                                 fReMap5[1][0]
__________________________________________________________________________________________________
rBlock6 (Model)                 (None, 32, 32, 576)  1312128     add_52[0][0]
__________________________________________________________________________________________________
SepConv6 (Model)                (None, 32, 32, 576)  347904      rBlock6[1][0]
__________________________________________________________________________________________________
RegMap6 (Model)                 (None, 32, 32, 48)   27648       SepConv6[1][0]
__________________________________________________________________________________________________
fReMap6 (Model)                 (None, 32, 32, 576)  29376       RegMap6[1][0]
__________________________________________________________________________________________________
add_62 (Add)                    (None, 32, 32, 576)  0           rBlock6[1][0]
                                                                 SepConv6[1][0]
                                                                 fReMap6[1][0]
__________________________________________________________________________________________________
rBlock7 (Model)                 (None, 32, 32, 576)  1312128     add_62[0][0]
__________________________________________________________________________________________________
SepConv7 (Model)                (None, 32, 32, 576)  347904      rBlock7[1][0]
__________________________________________________________________________________________________
RegMap7 (Model)                 (None, 32, 32, 48)   27648       SepConv7[1][0]
__________________________________________________________________________________________________
fReMap7 (Model)                 (None, 32, 32, 576)  29376       RegMap7[1][0]
__________________________________________________________________________________________________
add_72 (Add)                    (None, 32, 32, 576)  0           rBlock7[1][0]
                                                                 SepConv7[1][0]
                                                                 fReMap7[1][0]
__________________________________________________________________________________________________
rBlock8 (Model)                 (None, 32, 32, 576)  1312128     add_72[0][0]
__________________________________________________________________________________________________
SepConv8 (Model)                (None, 32, 32, 576)  347904      rBlock8[1][0]
__________________________________________________________________________________________________
RegMap8 (Model)                 (None, 32, 32, 48)   27648       SepConv8[1][0]
__________________________________________________________________________________________________
lambda_27 (Lambda)              (None, 32, 32, 16)   0           RegMap1[1][0]
__________________________________________________________________________________________________
lambda_28 (Lambda)              (None, 32, 32, 32)   0           RegMap1[1][0]
__________________________________________________________________________________________________
lambda_29 (Lambda)              (None, 32, 32, 16)   0           RegMap2[1][0]
__________________________________________________________________________________________________
lambda_30 (Lambda)              (None, 32, 32, 32)   0           RegMap2[1][0]
__________________________________________________________________________________________________
lambda_31 (Lambda)              (None, 32, 32, 16)   0           RegMap3[1][0]
__________________________________________________________________________________________________
lambda_32 (Lambda)              (None, 32, 32, 32)   0           RegMap3[1][0]
__________________________________________________________________________________________________
lambda_33 (Lambda)              (None, 32, 32, 16)   0           RegMap4[1][0]
__________________________________________________________________________________________________
lambda_34 (Lambda)              (None, 32, 32, 32)   0           RegMap4[1][0]
__________________________________________________________________________________________________
lambda_35 (Lambda)              (None, 32, 32, 16)   0           RegMap5[1][0]
__________________________________________________________________________________________________
lambda_36 (Lambda)              (None, 32, 32, 32)   0           RegMap5[1][0]
__________________________________________________________________________________________________
lambda_37 (Lambda)              (None, 32, 32, 16)   0           RegMap6[1][0]
__________________________________________________________________________________________________
lambda_38 (Lambda)              (None, 32, 32, 32)   0           RegMap6[1][0]
__________________________________________________________________________________________________
lambda_39 (Lambda)              (None, 32, 32, 16)   0           RegMap7[1][0]
__________________________________________________________________________________________________
lambda_40 (Lambda)              (None, 32, 32, 32)   0           RegMap7[1][0]
__________________________________________________________________________________________________
lambda_41 (Lambda)              (None, 32, 32, 16)   0           RegMap8[1][0]
__________________________________________________________________________________________________
lambda_42 (Lambda)              (None, 32, 32, 32)   0           RegMap8[1][0]
__________________________________________________________________________________________________
sSAM (Model)                    (None, 16, 2)        33280       lambda_27[0][0]
                                                                 lambda_29[0][0]
                                                                 lambda_31[0][0]
                                                                 lambda_33[0][0]
                                                                 lambda_35[0][0]
                                                                 lambda_37[0][0]
                                                                 lambda_39[0][0]
                                                                 lambda_41[0][0]
__________________________________________________________________________________________________
cSAM (Model)                    (None, 32, 2)        67584       lambda_28[0][0]
                                                                 lambda_30[0][0]
                                                                 lambda_32[0][0]
                                                                 lambda_34[0][0]
                                                                 lambda_36[0][0]
                                                                 lambda_38[0][0]
                                                                 lambda_40[0][0]
                                                                 lambda_42[0][0]
__________________________________________________________________________________________________
cjProb (Model)                  (None, 32, 1)        0           lambda_28[0][0]
                                                                 lambda_30[0][0]
                                                                 lambda_32[0][0]
                                                                 lambda_34[0][0]
                                                                 lambda_36[0][0]
                                                                 lambda_38[0][0]
                                                                 lambda_40[0][0]
                                                                 lambda_42[0][0]
__________________________________________________________________________________________________
Agg (Model)                     (None, 16, 2)        512         sSAM[1][0]
                                                                 cSAM[1][0]
                                                                 cjProb[1][0]
                                                                 sSAM[2][0]
                                                                 cSAM[2][0]
                                                                 cjProb[2][0]
                                                                 sSAM[3][0]
                                                                 cSAM[3][0]
                                                                 cjProb[3][0]
                                                                 sSAM[4][0]
                                                                 cSAM[4][0]
                                                                 cjProb[4][0]
                                                                 sSAM[5][0]
                                                                 cSAM[5][0]
                                                                 cjProb[5][0]
                                                                 sSAM[6][0]
                                                                 cSAM[6][0]
                                                                 cjProb[6][0]
                                                                 sSAM[7][0]
                                                                 cSAM[7][0]
                                                                 cjProb[7][0]
                                                                 sSAM[8][0]
                                                                 cSAM[8][0]
                                                                 cjProb[8][0]
__________________________________________________________________________________________________
sjProb (Model)                  (None, 16, 1)        0           lambda_27[0][0]
                                                                 lambda_29[0][0]
                                                                 lambda_31[0][0]
                                                                 lambda_33[0][0]
                                                                 lambda_35[0][0]
                                                                 lambda_37[0][0]
                                                                 lambda_39[0][0]
                                                                 lambda_41[0][0]
==================================================================================================
Total params: 14,847,936
Trainable params: 14,669,952
Non-trainable params: 177,984
__________________________________________________________________________________________________

Do you have any idea what may be the issue? Is the label shape wrong?

SaifAlDilaimi commented 6 years ago

@dluvizon I have tried every possible shape, but it's not working. Do you have any idea?

dluvizon commented 6 years ago

Hi @SaifAlDilaimi , it seems to me that you designed your model with 16 outputs but passed only 1 array for training. In other words, if your model has intermediate supervision (which is the case for my original model), you need to replicate the labels once for each output of the model.

SaifAlDilaimi commented 6 years ago

Hey @dluvizon ,

thanks for the hint! I have reshaped my labels to match the number of outputs of your model. Unfortunately, it still doesn't work :( Using 6 stages/supervisions (with export heatmaps set to False), I'm feeding the model the following data:

Input shape:  (?, 256, 256, 3)
Output length:  12
Shape of x_train:  (15935, 256, 256, 3) , of type:  int8
Shape of y_train:  (15935, 16, 2) , of type:  int32
Shape of x_val:  (1992, 256, 256, 3) , of type:  int8
Shape of y_val:  (1992, 16, 2) , of type:  int32
Shape of x_test:  (1992, 256, 256, 3) , of type:  int8
Shape of y_test:  (1992, 16, 2) , of type:  int32
Reshaping label to correlate with deep supervision of stages
Shape of y_train:  (12, 15935, 16, 2) , of type:  int32
Shape of y_val:  (12, 1992, 16, 2) , of type:  int32
Shape of y_test:  (12, 1992, 16, 2) , of type:  int32

I'm reading the Model.output property and building an np.array that contains the same y label len(Model.output) times. Still, I'm getting the same error...

ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 12 array(s), but instead got the following list of 1 arrays: [array([[[[219, 147],
         [157, 131],
         [103, 156],
         ...,
         [ 59, 134],
         [106, 125],
         [145, 102]],

        [[228, 123],
         [187, 123],
         [146, ...

I'm feeding an array that contains the labels 12 times. Why does it still say that it received a list of 1 array?

SaifAlDilaimi commented 6 years ago

@dluvizon I think I found something: following the traceback I found this line in training.py of keras:

https://github.com/keras-team/keras/blob/af804d0a5db9a8f20fbb083b48655b2687ce89d9/keras/engine/training.py#L78-L87

It checks whether there are as many expected names as label arrays: "names: List of expected array names."

I don't know why that error is thrown at all. Do you have any idea?

dluvizon commented 6 years ago

I still think that the problem is related to the labels that you are passing for training:

Expected to see 12 array(s), but instead got the following list of 1 arrays:

You should pass 12 arrays for training, not one array of size 12.
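
The difference between "12 arrays" and "one array of size 12" can be seen in a short NumPy sketch. The tiny array sizes are stand-ins chosen for illustration, not the real dataset.

```python
import numpy as np

num_outputs = 12
y_train = np.zeros((8, 16, 2), dtype=np.float32)  # small stand-in for the labels

# One ndarray with an extra leading axis: this is a single array,
# which is what the "list of 1 arrays" message in the thread refers to.
stacked = np.array([y_train] * num_outputs)

# A plain Python list holding 12 separate arrays, one per model output.
as_list = [y_train] * num_outputs

print(type(stacked).__name__, stacked.shape)
print(type(as_list).__name__, len(as_list), as_list[0].shape)
```

Wrapping the list in `np.array(...)` is the step that turns 12 targets into a single 4-D target, so passing the list itself to `fit` is what the suggestion above amounts to.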

SaifAlDilaimi commented 6 years ago

Well, sure, I'm passing an array containing 12 arrays, where each element holds my labels...

Reshaping label to correlate with deep supervision of stages
Shape of y_train:  (12, 15935, 16, 2) , of type:  int32
Shape of y_val:  (12, 1992, 16, 2) , of type:  int32
Shape of y_test:  (12, 1992, 16, 2) , of type:  int32

How can I pass 12 arrays directly? Currently I'm doing it like this:

def train(x_train, y_train, x_val, y_val, x_test, y_test, model_type=PreTrainedModel.CUSTOM):
    model = build_model()

    print("Shape of x_train: ", x_train.shape, ", of type: ", x_train.dtype)
    print("Shape of y_train: ", y_train.shape, ", of type: ", y_train.dtype)
    print("Shape of x_val: ", x_val.shape, ", of type: ", x_val.dtype)
    print("Shape of y_val: ", y_val.shape, ", of type: ", y_val.dtype)
    print("Shape of x_test: ", x_test.shape, ", of type: ", x_test.dtype)
    print("Shape of y_test: ", y_test.shape, ", of type: ", y_test.dtype)

    # checkpoint
    filepath="weights_"+model_type.name+".best.hdf5"
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.4, patience=5, min_lr=0.001)
    checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    tensorboard = TensorBoard(log_dir="logs/{}".format(time()), histogram_freq=0, write_graph=True, write_images=False)
    csv_logger = CSVLogger('log_'+model_type.name+'.csv', append=True, separator=';')

    callbacks_list = [reduce_lr, checkpoint, tensorboard, csv_logger]

    # Modify output to correlate with stages
    print("Reshaping label to correlate with deep supervision of stages")
    out_y_train, out_y_val, out_y_test = [], [], []

    for i in range(0, len(model.output)):
        out_y_train.append(y_train)
        out_y_val.append(y_val)
        out_y_test.append(y_test)

    y_train = np.array(out_y_train)
    y_val = np.array(out_y_val)
    y_test = np.array(out_y_test)

    print("Types: ", all(isinstance(x, np.ndarray) for x in y_train[0][0][0]))

    print("Shape of y_train: ", y_train.shape, ", of type: ", y_train.dtype)
    print("Shape of y_val: ", y_val.shape, ", of type: ", y_val.dtype)
    print("Shape of y_test: ", y_test.shape, ", of type: ", y_test.dtype)

    model.fit(x_train, y_train,
            batch_size=PARAMS.ML_BATCH_SIZE,
            epochs=PARAMS.ML_EPOCHS,
            verbose=1,
            validation_data=(x_val, y_val), callbacks=callbacks_list)
    score = model.evaluate(x_test, y_test, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

SaifAlDilaimi commented 6 years ago

@dluvizon I'm one step further! You were right that my arrays were not formatted correctly. Now, however, the model throws an error that seems to be linked to the joint probability layer sjProb:

Traceback (most recent call last):
  File "Train.py", line 154, in <module>
    main()
  File "Train.py", line 151, in main
    train(x_train, y_train, x_val, y_val, x_test, y_val)
  File "Train.py", line 76, in train
    validation_data=(x_val, out_y_val), callbacks=callbacks_list)
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1630, in fit
    batch_size=batch_size)
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 123, in _standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected sjProb to have shape (16, 1) but got array with shape (16, 2)

Where do I add this array and what should be its content? Thank you so much for the help!

SaifAlDilaimi commented 6 years ago

@dluvizon can you give me any hints?

dluvizon commented 6 years ago

Hi,

ValueError: Error when checking target: expected sjProb to have shape (16, 1) but got array with shape (16, 2)

You have to provide joint probabilities in the right shape (16,1).

SaifAlDilaimi commented 6 years ago

Can you elaborate on this? Where do I get the joint probabilities? @dluvizon

dluvizon commented 6 years ago

If you want to supervise joint probabilities (section 3.2.2 in the paper), you need to provide the labels for that in the shape (16,1). Basically, it is 1 if the joint is available, 0 otherwise.
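
A sketch of how such labels might be derived from the joint coordinates follows. How a dataset marks a missing joint is not stated in this thread, so the rule used here (a negative or NaN coordinate means "not available") is purely an assumption, as is the helper name `visibility_labels`.

```python
import numpy as np

def visibility_labels(joints):
    """Return (N, J, 1) labels: 1.0 where a joint is available, else 0.0.

    Assumption: a missing joint is stored with a negative or NaN coordinate.
    """
    joints = np.asarray(joints, dtype=np.float32)
    available = np.all(np.isfinite(joints) & (joints >= 0), axis=-1)
    # Add a trailing axis so each sample has shape (J, 1).
    return available.astype(np.float32)[..., np.newaxis]

joints = np.ones((2, 16, 2), dtype=np.float32)
joints[0, 3] = [-1.0, -1.0]  # pretend this joint is not annotated
labels = visibility_labels(joints)
print(labels.shape, labels[0, 3, 0], labels.sum())
```

Note that the result keeps the sample dimension, so the full array is (N, 16, 1) and each individual sample has the (16, 1) shape mentioned above.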

SaifAlDilaimi commented 6 years ago

Okay, thank you, but I'm calling Keras fit with a list containing 12 arrays of my joint locations, each of shape (15935, 16, 2), plus 1 array that contains 16x 1: [[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]]. I build them like this:

def train(x_train, y_train, x_val, y_val, x_test, y_test):
    # x,y are in following shapes passed
    # Input shape:  (?, 256, 256, 3)
    # Output length:  12
    # Shape of x_train:  (15935, 256, 256, 3) , of type:  int8
    # Shape of y_train:  (15935, 16, 2) , of type:  int32
    # Shape of x_val:  (1992, 256, 256, 3) , of type:  int8
    # Shape of y_val:  (1992, 16, 2) , of type:  int32
    # Shape of x_test:  (1992, 256, 256, 3) , of type:  int8
    # Shape of y_test:  (1992, 16, 2) , of type:  int32

    model = build_model()
    out_y_train, out_y_val, out_y_test = [], [], []

    for i in range(0, len(model.output) - 1):
        out_y_train.append(y_train)
        out_y_val.append(y_val)
        out_y_test.append(y_test)

    # append label array for joint probability 
    y_sjprob_train = np.array([[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]])
    out_y_train.append(y_sjprob_train)
    y_sjprob_test = np.array([[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]])
    out_y_test.append(y_sjprob_test)
    y_sjprob_val = np.array([[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]])
    out_y_val.append(y_sjprob_val)

    model.fit(x_train, out_y_train,
            batch_size=PARAMS.ML_BATCH_SIZE,
            epochs=PARAMS.ML_EPOCHS,
            verbose=1,
            validation_data=(x_val, out_y_val), callbacks=callbacks_list)

I'm still getting the same error @dluvizon :(

  File "Train.py", line 153, in <module>
    main()
  File "Train.py", line 150, in main
    train(x_train, y_train, x_val, y_val, x_test, y_val)
  File "Train.py", line 75, in train
    validation_data=(x_val, out_y_val), callbacks=callbacks_list)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/training.py", line 1630, in fit
    batch_size=batch_size)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/keras/engine/training.py", line 123, in _standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected sjProb to have shape (16, 1) but got array with shape (16, 2)

Can you give me an example of how the training data should look and how I pass the training data for the sjProb layer? Any help is highly appreciated.

SaifAlDilaimi commented 6 years ago

hey @dluvizon ,

after checking which outputs the network generates, and especially in which order:

Output:  Tensor("Agg/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_1/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_1/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_2/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_2/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_3/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_3/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_4/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_4/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_5/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_5/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)

So I have built my labels to always contain the joint probability array:

    # Modify output to correlate with stages
    print("Reshaping label to correlate with deep supervision of stages")
    out_y_train, out_y_val, out_y_test = [], [], []
    sjprob_init = [[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1],[1]]

    for i in range(0, len(model.output)):            
        # append label array for joint probability 
        out_y_train.append(np.array([y_train, sjprob_init]))
        out_y_test.append(np.array([y_test, sjprob_init]))
        out_y_val.append(np.array([y_val, sjprob_init]))

Unfortunately, I now have another error, which belongs to the context aggregation layer:

Traceback (most recent call last):
  File "Train.py", line 146, in <module>
    main()
  File "Train.py", line 143, in main
    train(x_train, y_train, x_val, y_val, x_test, y_val)
  File "Train.py", line 68, in train
    validation_data=(x_val, out_y_val), callbacks=callbacks_list)
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1630, in fit
    batch_size=batch_size)
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "C:\Users\saifa\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected Agg to have 3 dimensions, but got array with shape (2, 1)

Now I'm not sure what is needed here. My labels have shape (16,2) for the joint x,y coordinates, plus the joint probabilities of shape (16,1), where 1 means the joint is available and 0 means it is not (right?). So what does the Agg layer need here? 3-dimensional inputs are mostly images... I have now read your paper multiple times (section 3.2.3?) but I can't seem to find a reference to this. Can you give me a hint as to what data that layer expects and why this error is thrown?

dluvizon commented 6 years ago

Hi @SaifAlDilaimi, please make sure that the errors here are actually caused by problems in my code. If you can compile the model (print the summary) and pass one image through it, then the problem is due to the labels that you are providing for training. Unfortunately, I am not available to provide support for your own implementation.

Best,

SaifAlDilaimi commented 6 years ago

@dluvizon the problem is that I'm not able to figure out what labels your model needs. If I know what data your model expects as labels, I can provide that. It doesn't really have anything to do with my implementation. It's a general question.

dluvizon commented 6 years ago

But you already know that, as you posted here:

Output:  Tensor("Agg/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_1/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_1/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_2/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_2/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_3/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_3/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_4/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_4/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)
Output:  Tensor("Agg_5/add_2/add:0", shape=(?, 16, 2), dtype=float32)
Output:  Tensor("sjProb_5/lambda_9/ExpandDims:0", shape=(?, 16, 1), dtype=float32)

The model expects a tensor of shape (Batch_size, 16, 2) for poses and (Batch_size, 16, 1) for joint probabilities, for which you can use a vector of ones, like you did. Then you have to provide the same number of labels (in the right order) as the model is expecting. Finally, if you can produce these outputs by passing an image through the model, you should be able to train it, right?
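A minimal sketch of how such a label list could be assembled, assuming the outputs alternate between a pose head and a joint-probability head in the order printed above. The helper name build_targets and its num_blocks parameter are hypothetical and not part of the repository.

```python
import numpy as np

def build_targets(poses, num_blocks):
    """Build one target per model output, in the order Agg, sjProb, Agg_1, ...

    poses      : array of shape (N, 16, 2) with the ground-truth joints.
    num_blocks : number of (Agg, sjProb) output pairs of the model.
    """
    poses = np.asarray(poses, dtype=np.float32)
    # One probability per joint and per sample: shape (N, 16, 1), all ones.
    probs = np.ones(poses.shape[:2] + (1,), dtype=np.float32)
    targets = []
    for _ in range(num_blocks):
        targets.append(poses)   # target for the pose output
        targets.append(probs)   # target for the joint-probability output
    return targets

# Toy data standing in for y_train; 6 blocks give the 12 outputs listed above.
y_toy = np.zeros((5, 16, 2))
targets = build_targets(y_toy, num_blocks=6)
print(len(targets), targets[0].shape, targets[1].shape)
```

The resulting list can then be passed as the second argument of model.fit in place of a single label array. Each entry keeps the sample dimension N, and the order of the entries follows the order of the model outputs.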

SaifAlDilaimi commented 6 years ago

@dluvizon sorry for the late response, but it's working and I'm able to train the model! I'm just curious because after 47/120 epochs the accuracy of all aggregation layers is the same. It starts at an acc of 0.65 and goes down to 0.44, then stays there for each epoch... here are some epoch results:

Epoch 44/120
 - 1782s - loss: 4634412.5771 - Agg_loss: 579301.5720 - sjProb_loss: 1.1921e-07 - Agg_acc: 0.4487 - sjProb_acc: 1.0000 - Agg_acc_1: 0.4487 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.4487 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.4487 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.4487 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.4487 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.4487 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.4487 - sjProb_acc_7: 1.0000 - val_loss: 4631173.0040 - val_Agg_loss: 578896.6255 - val_sjProb_loss: 1.1921e-07 - val_Agg_acc: 0.4519 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.4519 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.4519 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.4519 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.4519 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.4519 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.4519 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.4519 - val_sjProb_acc_7: 1.0000

Epoch 00044: val_Agg_acc did not improve
Epoch 45/120
 - 1781s - loss: 4634412.5756 - Agg_loss: 579301.5720 - sjProb_loss: 1.1921e-07 - Agg_acc: 0.4487 - sjProb_acc: 1.0000 - Agg_acc_1: 0.4487 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.4487 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.4487 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.4487 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.4487 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.4487 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.4487 - sjProb_acc_7: 1.0000 - val_loss: 4631173.0040 - val_Agg_loss: 578896.6255 - val_sjProb_loss: 1.1921e-07 - val_Agg_acc: 0.4519 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.4519 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.4519 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.4519 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.4519 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.4519 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.4519 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.4519 - val_sjProb_acc_7: 1.0000

Epoch 00045: val_Agg_acc did not improve
Epoch 46/120
 - 1782s - loss: 4634412.5766 - Agg_loss: 579301.5720 - sjProb_loss: 1.1921e-07 - Agg_acc: 0.4487 - sjProb_acc: 1.0000 - Agg_acc_1: 0.4487 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.4487 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.4487 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.4487 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.4487 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.4487 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.4487 - sjProb_acc_7: 1.0000 - val_loss: 4631173.0040 - val_Agg_loss: 578896.6255 - val_sjProb_loss: 1.1921e-07 - val_Agg_acc: 0.4519 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.4519 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.4519 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.4519 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.4519 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.4519 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.4519 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.4519 - val_sjProb_acc_7: 1.0000

Epoch 00046: val_Agg_acc did not improve
Epoch 47/120
 - 1780s - loss: 4634412.5786 - Agg_loss: 579301.5720 - sjProb_loss: 1.1921e-07 - Agg_acc: 0.4487 - sjProb_acc: 1.0000 - Agg_acc_1: 0.4487 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.4487 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.4487 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.4487 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.4487 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.4487 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.4487 - sjProb_acc_7: 1.0000 - val_loss: 4631173.0040 - val_Agg_loss: 578896.6255 - val_sjProb_loss: 1.1921e-07 - val_Agg_acc: 0.4519 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.4519 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.4519 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.4519 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.4519 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.4519 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.4519 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.4519 - val_sjProb_acc_7: 1.0000

Do you have any idea what may be the issue here? As you described in section 4.1, I have set up an array that contains the loss functions depending on the number of generated outputs. I will let it run till the end, but it feels wrong...

SaifAlDilaimi commented 6 years ago

Hello @dluvizon, in your code you set a flag to export heatmaps. In my training I didn't export heatmaps. Could this be the problem?

To test it I have tried to export heatmaps, but with K = 8 I have to provide 8 labels in the following shapes:

Tensor("lambda_27/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_29/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_31/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_33/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_35/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_37/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_39/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)
Tensor("lambda_41/strided_slice:0", shape=(?, 32, 32, 16), dtype=float32)

What do I need to provide here? Joints, but why shape (32,32,16)?

I know that I'm really stubborn but I really want to understand your work.. I hope you can help me...

dluvizon commented 6 years ago

Hi, the option to export heatmaps is only for visualizing them after training.

BTW, there is something really wrong with your loss here:

loss: 4634412.5786

That value is too high.

SaifAlDilaimi commented 6 years ago

hey @dluvizon ,

after reading your paper for the x-th time, this part (Section 4.1) stuck out: ... In this case, we use directly the joint coordinates normalized to the interval [0; 1], where the top-left image corner corresponds to (0; 0), and the bottom-right image corner corresponds to (1; 1).
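The normalization described in the quoted passage can be sketched as follows, assuming the joints are stored as pixel coordinates on the cropped image. The function name normalize_joints is hypothetical, and dividing by the full width and height (rather than width - 1 and height - 1) is an assumed convention.

```python
import numpy as np

def normalize_joints(joints_px, width, height):
    """Map pixel coordinates (x, y) to [0, 1] x [0, 1].

    joints_px : array of shape (N, J, 2) in pixels, (0, 0) = top-left corner.
    """
    # Cast first: integer labels (e.g. int32) would otherwise truncate.
    joints = np.asarray(joints_px, dtype=np.float32)
    scale = np.array([width, height], dtype=np.float32)
    return joints / scale

# Toy example on a 256 x 256 crop.
toy_px = np.array([[[0, 0], [256, 256], [128, 64]]], dtype=np.int32)
toy_norm = normalize_joints(toy_px, width=256, height=256)
print(toy_norm)
```

Predictions of a model trained on such labels are in the same normalized units, so they need to be multiplied by the image size again before drawing them on the image.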

My (x,y) coordinates were still in the range [0,IMAGE_WIDTH]. So I have normalized them to the interval [0,1]. Now my output is this for the first 4 epochs:

Epoch 1/4
 - 1331s - loss: 26.3705 - Agg_loss: 3.2928 - sjProb_loss: 0.0019 - Agg_acc: 0.8034 - sjProb_acc: 1.0000 - Agg_acc_1: 0.8036 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.8040 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.8035 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.8034 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.8031 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.8029 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.8035 - sjProb_acc_7: 1.0000 - val_loss: 25.9747 - val_Agg_loss: 3.2464 - val_sjProb_loss: 1.1047e-04 - val_Agg_acc: 0.8051 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.8051 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.8056 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.8055 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.8052 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.8052 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.8053 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.8055 - val_sjProb_acc_7: 1.0000

Epoch 00001: val_Agg_acc improved from -inf to 0.80513, saving model to weights_CUSTOM.best.hdf5
Epoch 2/4
 - 1328s - loss: 26.0118 - Agg_loss: 3.2522 - sjProb_loss: 3.1062e-05 - Agg_acc: 0.8066 - sjProb_acc: 1.0000 - Agg_acc_1: 0.8065 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.8066 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.8062 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.8062 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.8063 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.8065 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.8061 - sjProb_acc_7: 1.0000 - val_loss: 25.8309 - val_Agg_loss: 3.2293 - val_sjProb_loss: 1.7252e-05 - val_Agg_acc: 0.8074 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.8076 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.8077 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.8064 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.8075 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.8073 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.8068 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.8062 - val_sjProb_acc_7: 1.0000

Epoch 00002: val_Agg_acc improved from 0.80513 to 0.80742, saving model to weights_CUSTOM.best.hdf5
Epoch 3/4
 - 1317s - loss: 25.8515 - Agg_loss: 3.2334 - sjProb_loss: 1.6235e-05 - Agg_acc: 0.8087 - sjProb_acc: 1.0000 - Agg_acc_1: 0.8096 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.8093 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.8084 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.8083 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.8083 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.8086 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.8080 - sjProb_acc_7: 1.0000 - val_loss: 25.8895 - val_Agg_loss: 3.2424 - val_sjProb_loss: 3.3734e-05 - val_Agg_acc: 0.8102 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.8083 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.8086 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.8094 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.8101 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.8085 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.8096 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.8082 - val_sjProb_acc_7: 1.0000

Epoch 00003: val_Agg_acc improved from 0.80742 to 0.81018, saving model to weights_CUSTOM.best.hdf5
Epoch 4/4
 - 1309s - loss: 25.6343 - Agg_loss: 3.2059 - sjProb_loss: 1.2252e-05 - Agg_acc: 0.8111 - sjProb_acc: 1.0000 - Agg_acc_1: 0.8122 - sjProb_acc_1: 1.0000 - Agg_acc_2: 0.8124 - sjProb_acc_2: 1.0000 - Agg_acc_3: 0.8116 - sjProb_acc_3: 1.0000 - Agg_acc_4: 0.8111 - sjProb_acc_4: 1.0000 - Agg_acc_5: 0.8113 - sjProb_acc_5: 1.0000 - Agg_acc_6: 0.8108 - sjProb_acc_6: 1.0000 - Agg_acc_7: 0.8113 - sjProb_acc_7: 1.0000 - val_loss: 25.5565 - val_Agg_loss: 3.1973 - val_sjProb_loss: 3.7608e-06 - val_Agg_acc: 0.8086 - val_sjProb_acc: 1.0000 - val_Agg_acc_1: 0.8100 - val_sjProb_acc_1: 1.0000 - val_Agg_acc_2: 0.8106 - val_sjProb_acc_2: 1.0000 - val_Agg_acc_3: 0.8104 - val_sjProb_acc_3: 1.0000 - val_Agg_acc_4: 0.8106 - val_sjProb_acc_4: 1.0000 - val_Agg_acc_5: 0.8106 - val_sjProb_acc_5: 1.0000 - val_Agg_acc_6: 0.8103 - val_sjProb_acc_6: 1.0000 - val_Agg_acc_7: 0.8086 - val_sjProb_acc_7: 1.0000

Am I getting closer? The Agg_acc of 0.80 seems too good for the first epoch. Can you just tell me if I'm getting closer or if I'm totally wrong?


In Section 4.2 you mention 4 different metrics: PCP, PCK, OC & PC. Can you maybe give me a reference on how to calculate those? In your measures.py I have found the method pckh(y_true, y_pred, head_size, refp=0.5), but to use it with Keras the parameter head_size needs to be set. What would be a valid number here?

dluvizon commented 6 years ago

Hi, I don't know how you are getting the acc., because there is no function in Keras that implements PCK or PCP.

The function pckh(...) in my code does that. The references are in the paper.

In practice, you train for x epochs, compute the predictions, and finally compute the PCKh.
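As an illustration of that last step, here is a minimal NumPy sketch of a PCKh-style score. It is not the pckh function from the repository: the name pckh_sketch, the optional valid mask and the exact thresholding are assumptions. A joint counts as correct when its distance to the ground truth is at most refp times the head size of that sample, with head_size given in the same units as the coordinates.

```python
import numpy as np

def pckh_sketch(y_true, y_pred, head_size, refp=0.5, valid=None):
    """Fraction of joints with ||pred - true|| <= refp * head_size.

    y_true, y_pred : arrays of shape (N, J, 2).
    head_size      : scalar or array of shape (N,), one value per sample.
    valid          : optional boolean array (N, J); False joints are ignored.
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    head = np.asarray(head_size, dtype=np.float64).reshape(-1, 1)
    dist = np.linalg.norm(y_true - y_pred, axis=-1)      # shape (N, J)
    correct = dist <= refp * head                        # broadcast per sample
    if valid is None:
        valid = np.ones(correct.shape, dtype=bool)
    valid = np.asarray(valid, dtype=bool)
    return float(correct[valid].mean())

# Toy check: 2 samples, 3 joints, one prediction off by a full head size.
gt = np.zeros((2, 3, 2))
pred = gt.copy()
pred[0, 0] = [1.0, 0.0]
score = pckh_sketch(gt, pred, head_size=[1.0, 1.0])
print(score)
```

Such a function would be called on the predictions of the last output after a number of epochs, outside of the Keras metrics.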

SaifAlDilaimi commented 6 years ago

But if I calculate the PCKh at the end, how do I adjust the learning rate while training? In your paper you wrote that you decrease the learning rate based on the validation set. But on which metric? Loss or accuracy, and if accuracy, of which block K?

SaifAlDilaimi commented 6 years ago

@dluvizon can you just tell me on which metric of the validation set you adjust the learning rate?

The learning rate begins at 10^-3 and decreases by a factor of 0.4 when accuracy on validation plateaus.

Which accuracy do you mean?

dluvizon commented 6 years ago

Hi, As it is written in the paper, the decrease factor is applied when accuracy on validation plateaus and the prediction taken into account is the last one. There is only one type of accuracy per model, which is the PCKh metric for MPII.

SaifAlDilaimi commented 6 years ago

There is only one accuracy per model, which is the PCKh metric for MPII.

So by "model" in that context, do you also mean the blocks K_i?

SaifAlDilaimi commented 6 years ago

Also: when using TensorBoard I have several layers that I can inspect (every layer in the model summary, see above), but which layer represents or displays the weights? As far as I know it is the conv2d layers, but the weights look as if they don't change:

[two TensorBoard screenshots]

In your paper you wrote:

The implementation of Soft-argmax can be easily done with recent frameworks, such as TensorFlow, just by concatenating a spatial softmax followed by one convolutional layer with 2 filters of size W x H, with fixed parameters according to equation (3).

Do I need to do something here, or is it all done in the def build_softargmax_2d(input_shape, name=None) function?
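For reference, the operation described in that quote can be sketched in plain NumPy: a softmax over all spatial positions of each heatmap, followed by a weighted sum with fixed coordinate weights. This is not the repository's build_softargmax_2d. The function name soft_argmax_2d and the choice of weights (evenly spaced values from 0 to 1 along each axis) are assumptions, since equation (3) is not reproduced in this thread.

```python
import numpy as np

def soft_argmax_2d(heatmaps):
    """heatmaps: (N, H, W, J) -> expected (x, y) per joint, shape (N, J, 2)."""
    h = np.asarray(heatmaps, dtype=np.float64)
    n, height, width, joints = h.shape
    # Spatial softmax: normalize over all H*W positions of each heatmap.
    flat = h.reshape(n, height * width, joints)
    flat = flat - flat.max(axis=1, keepdims=True)   # numerical stability
    prob = np.exp(flat)
    prob /= prob.sum(axis=1, keepdims=True)
    prob = prob.reshape(n, height, width, joints)
    # Fixed weights (assumed): column index -> x, row index -> y, in [0, 1].
    xs = np.linspace(0.0, 1.0, width)
    ys = np.linspace(0.0, 1.0, height)
    x = np.einsum("nhwj,w->nj", prob, xs)
    y = np.einsum("nhwj,h->nj", prob, ys)
    return np.stack([x, y], axis=-1)

# Toy heatmap: one sharp peak at row 2, column 3 of a 5 x 5 map.
toy_heat = np.zeros((1, 5, 5, 1))
toy_heat[0, 2, 3, 0] = 50.0
toy_xy = soft_argmax_2d(toy_heat)
print(toy_xy)
```

Because the weights are fixed, there is nothing to learn in this step. With the heatmap shape (?, 32, 32, 16) listed earlier, the output would have shape (?, 16, 2).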

dluvizon commented 6 years ago

I mean, there is only one type of accuracy. If the model has K supervisions, then it has K accuracy scores, but all of them are of the same type (PCKh, for example) and only the last one is used to decrease the lr.
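A minimal sketch of such a schedule in plain Python, assuming the validation accuracy of the last output is computed after every epoch. The class name PlateauDecay and the patience and min_delta parameters are hypothetical, and the thread does not say how long a plateau has to last. Only the factor of 0.4 comes from the quoted passage, and the starting value below is illustrative.

```python
class PlateauDecay:
    """Multiply the learning rate by `factor` when the score stops improving."""

    def __init__(self, lr, factor=0.4, patience=1, min_delta=0.0):
        self.lr = lr
        self.factor = factor
        self.patience = patience      # epochs without improvement tolerated
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best = None
        self.wait = 0

    def update(self, val_score):
        """Register the validation score of one epoch and return the new lr."""
        if self.best is None or val_score > self.best + self.min_delta:
            self.best = val_score
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr

# Toy run: the score of the last output stalls at 0.81 for two epochs.
sched = PlateauDecay(lr=1e-3, factor=0.4, patience=2)  # illustrative values
history = [sched.update(s) for s in [0.80, 0.81, 0.81, 0.81, 0.82]]
print(history)
```

Keras also ships a ReduceLROnPlateau callback, but it monitors quantities that Keras logs itself. With a metric computed outside of Keras, as described above, a small manual scheduler like this one is one possible alternative.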