michalfaber / keras_Realtime_Multi-Person_Pose_Estimation

Keras version of Realtime Multi-Person Pose Estimation project

How to generate ground truth #8

Open liu6381810 opened 7 years ago

liu6381810 commented 7 years ago

Thanks for your great work! I now have my own dataset (not COCO) and I want to know how to generate the ground truth (confidence maps and PAFs). In particular, when generating the ground-truth PAF, I don't know how to judge whether a point lies on a limb; the threshold is not clear from the paper. I also can't find where the ground truth is generated in your code. And what do mask_all and mask_miss mean in COCO? Thanks!

michalfaber commented 7 years ago

Have a look here: https://github.com/michalfaber/rmpe_dataset_transformer/blob/c138fb82b175d0a331b36e0fa8fc581913ac413f/DataTransformer.cpp#L389. This code builds the ground truth for confidence maps and PAFs using the keypoints `meta.joint_self.joints[i]`. If you have your own dataset, you will have to annotate those keypoints (x, y coordinates of body parts) somehow. I am not so sure about the usefulness of the miss masks; most miss masks are all '1's. Thanks
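For a rough idea of what the C++ `putGaussianMaps` does, here is a NumPy sketch of placing one keypoint's Gaussian peak on a 46x46 confidence map. The stride/offset handling approximates the original code; treat the exact coordinate convention as an assumption:

```python
import numpy as np

def put_gaussian_map(heatmap, center, sigma=7.0, stride=8):
    """Add a Gaussian peak for one keypoint onto a confidence map.

    heatmap: (H, W) array at the network output resolution (e.g. 46x46);
    center:  (x, y) keypoint in input-image coordinates (e.g. 368x368);
    stride:  downsampling factor between input and output (368 / 46 = 8).
    """
    h, w = heatmap.shape
    # grid of output-pixel centers mapped back to input-image coordinates
    xs = np.arange(w) * stride + stride / 2 - 0.5
    ys = np.arange(h) * stride + stride / 2 - 0.5
    gx, gy = np.meshgrid(xs, ys)
    d2 = (gx - center[0]) ** 2 + (gy - center[1]) ** 2
    g = np.exp(-d2 / (2.0 * sigma ** 2))
    # take the element-wise max so overlapping people keep sharp peaks
    np.maximum(heatmap, g, out=heatmap)
    return heatmap
```

Calling this once per person per body part, then adding a background channel, gives the L2 (heatmap) target.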

liu6381810 commented 6 years ago

Thanks for your reply @michalfaber. I generate the ground truth using the functions putGaussianMaps and putVecMaps from the code you mentioned. For the heat map, the size is 46 x 46, but only a few points are set to values > 0, with all other points set to 0. When training, the network tends to predict 0 everywhere, and even so the loss is very low because we use mean squared error. So is there any trick in the training procedure?

michalfaber commented 6 years ago

@liu6381810 This is a really good point. I realized my mistake: I should use Euclidean loss instead of mean squared error. Thanks

liu6381810 commented 6 years ago

@michalfaber So you mean Euclidean loss can work? When I use MSE, the loss is very low (0.001), but at prediction time all the points in the heat map have nearly the same value (close to 0).

liu6381810 commented 6 years ago

@michalfaber I get the same result with this loss:

```python
def euclidean_loss(y_true, y_pred):
    return K.sum(K.square(y_true - y_pred), axis=-1)
```

And I found that stages 1-6 have the same loss and the same output, which is weird.

If I understand correctly, the Euclidean loss = 46 * 46 * mean_squared_error?

michalfaber commented 6 years ago

@liu6381810 Currently, I run training with the loss function:

```python
def euclid_loss(y_true, y_pred):
    return mean_squared_error(y_true, y_pred) * 0.5
```

This is the same as in Caffe: http://caffe.berkeleyvision.org/tutorial/layers/euclideanloss.html. Indeed, it's weird that you get the same loss value in all stages. Maybe it's because you don't calculate the mean in your loss function.

liu6381810 commented 6 years ago

@michalfaber So just multiply by a factor of 0.5? What do you mean by "calculate the mean in the loss func"? Also, when I checked the Keras loss source:

```python
def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
```

axis=-1 means only the last axis is reduced, so the loss's shape is (batch_size, 46, 46). I think maybe that is a problem?
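As a quick sanity check of the shape question (a NumPy stand-in for the Keras backend ops): reducing only the last axis leaves a per-pixel loss map, and Keras then averages the remaining axes into the scalar it reports, so axis=-1 on its own is not the bug:

```python
import numpy as np

# shapes for one heatmap branch: batch, height, width, channels
batch, h, w, c = 2, 46, 46, 19
rng = np.random.default_rng(0)
y_true = rng.random((batch, h, w, c))
y_pred = rng.random((batch, h, w, c))

# what mean_squared_error(..., axis=-1) returns: a per-pixel map
per_pixel = np.mean((y_pred - y_true) ** 2, axis=-1)
assert per_pixel.shape == (batch, h, w)

# Keras averages that map over the remaining axes for the reported loss
scalar = per_pixel.mean()
```

So the effective scalar is the mean over all elements, exactly the standard MSE; the real issue discussed below is its magnitude relative to Caffe's summed loss.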

michalfaber commented 6 years ago

I meant that you use K.sum(K.square(...)) instead of K.mean(K.square(...)) in your loss function, but that would behave the same as the standard Keras mean_squared_error. Yes, multiplying by 0.5 doesn't change much, but it is 100% the same as the original implementation. mean_squared_error should be okay. I still have no idea why you get the same loss in all stages. Did you check whether the samples and labels are correct?

liu6381810 commented 6 years ago

@michalfaber I changed some code from your project. Maybe the error occurs because of the changes?

What I changed:

  1. The input is just the image (None, None, 3); I dropped vec_weight_input and heat_weight_input. Actually, I don't know what vec_weight_input and heat_weight_input mean. I don't use the COCO dataset, so my network has just one image input and 6 * 2 outputs.
  2. The output sizes are 46 x 46 x 14 and 46 x 46 x 26, because the dataset has 14 parts and 13 links.
  3. I dropped apply_mask.
  4. I dropped the loss_weight in model.compile(loss_weights=loss_weights) and use the Adam optimizer.
  5. I don't use lr_mult.
  6. My train generator's output is (batch_size, 368, 368, 3) and [(batch_size, 46, 46, 26), (batch_size, 46, 46, 14)] * 6.
  7. The loss is [euclid_loss] * 12 as you mentioned above. I have changed it to mean_squared_error * 0.5.

When I check the train generator's output and plot the image and heat map together, it looks right; I can't find any faults.

The loss values are also very weird:

```
461/13125 [>.............................] - ETA: 22801s - loss: 0.6944
- Mconv5_stage1_L1_loss: 0.0019 - Mconv5_stage1_L2_loss: 7.9071e-04
- Mconv7_stage2_L1_loss: 0.0019 - Mconv7_stage2_L2_loss: 7.9071e-04
- Mconv7_stage3_L1_loss: 0.0019 - Mconv7_stage3_L2_loss: 7.9071e-04
- Mconv7_stage4_L1_loss: 0.0019 - Mconv7_stage4_L2_loss: 7.9071e-04
- Mconv7_stage5_L1_loss: 0.0019 - Mconv7_stage5_L2_loss: 7.9072e-04
- Mconv7_stage6_L1_loss: 0.0019 - Mconv7_stage6_L2_loss: 7.9071e-04
```

After just 461 iterations in epoch 1, the losses are almost identical across all stages, differing only in the last digit, e.g. Mconv7_stage5_L2_loss: 7.9072e-04 vs Mconv7_stage6_L2_loss: 7.9071e-04.

liu6381810 commented 6 years ago

@michalfaber Hi, could you please give me some advice on this problem? I have been trying to solve it for days with no progress. It's really strange.

michalfaber commented 6 years ago

@liu6381810 I didn't try the Adam optimizer yet, but the original implementation uses multiple learning rates, so I wrote a custom MultiSGD optimizer. I also noticed significant differences between the losses in the PAF branch and the heatmap branch. Maybe the key is to tweak the learning rates for particular branches/stages. I am still experimenting, without any significant breakthrough.

liu6381810 commented 6 years ago

Thanks. If you have any breakthrough, please let me know. Thanks for your great help!

liu6381810 commented 6 years ago

@michalfaber Hi, I think I may have found the problem. In Keras we use the MSE loss, but in Caffe the author uses the Euclidean loss. If we check Caffe's docs, we can see that the Euclidean loss only divides by the batch size and does not take the mean over the heat map, while MSE in Keras does. So the Euclidean loss is the sum over the heat maps, not the mean. But we use the same learning rate, 4e-5, as the author, so that's the problem.

Looking at the output loss from the author's code repository, the first iteration has a loss of 6777. With MSE we only get a loss in the range (0, 10) in the first iteration, so the learning rate is far too large when using MSE as the loss.

Here is the author's loss for the first iteration:

```
I0902 21:07:19.809747 23236 solver.cpp:228] Iteration 0, loss = 6777.38
I0902 21:07:19.809798 23236 solver.cpp:244] Train net output #0:  loss_stage1_L1 = 174.77  (* 1 = 174.77 loss)
I0902 21:07:19.809808 23236 solver.cpp:244] Train net output #1:  loss_stage1_L2 = 953.576 (* 1 = 953.576 loss)
I0902 21:07:19.809813 23236 solver.cpp:244] Train net output #2:  loss_stage2_L1 = 174.855 (* 1 = 174.855 loss)
I0902 21:07:19.809820 23236 solver.cpp:244] Train net output #3:  loss_stage2_L2 = 953.554 (* 1 = 953.554 loss)
I0902 21:07:19.809837 23236 solver.cpp:244] Train net output #4:  loss_stage3_L1 = 174.817 (* 1 = 174.817 loss)
I0902 21:07:19.809844 23236 solver.cpp:244] Train net output #5:  loss_stage3_L2 = 954.969 (* 1 = 954.969 loss)
I0902 21:07:19.809881 23236 solver.cpp:244] Train net output #6:  loss_stage4_L1 = 174.805 (* 1 = 174.805 loss)
I0902 21:07:19.809890 23236 solver.cpp:244] Train net output #7:  loss_stage4_L2 = 955.595 (* 1 = 955.595 loss)
I0902 21:07:19.809895 23236 solver.cpp:244] Train net output #8:  loss_stage5_L1 = 174.755 (* 1 = 174.755 loss)
I0902 21:07:19.809901 23236 solver.cpp:244] Train net output #9:  loss_stage5_L2 = 954.972 (* 1 = 954.972 loss)
I0902 21:07:19.809907 23236 solver.cpp:244] Train net output #10: loss_stage6_L1 = 174.842 (* 1 = 174.842 loss)
I0902 21:07:19.809913 23236 solver.cpp:244] Train net output #11: loss_stage6_L2 = 955.873 (* 1 = 955.873 loss)
```

After I changed the loss, I got a reasonable result. Thank you!
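The scale mismatch described above can be checked numerically. Assuming one 46x46x19 heatmap branch, Caffe's summed-and-halved Euclidean loss is about 20,000 times larger than the per-element mean that plain Keras MSE reports, which explains why the same learning rate behaves so differently:

```python
import numpy as np

h, w, c = 46, 46, 19
rng = np.random.default_rng(0)
diff = rng.random((h, w, c)) - rng.random((h, w, c))

caffe_style = 0.5 * np.sum(diff ** 2)   # Caffe EuclideanLoss per sample
keras_mse = np.mean(diff ** 2)          # plain Keras mean_squared_error

# the two losses differ by exactly 0.5 * h * w * c (about 20,102 here)
ratio = caffe_style / keras_mse
assert abs(ratio / (0.5 * h * w * c) - 1) < 1e-9
```

So with MSE, the gradients are roughly 20,000x smaller than the original training setup expects, and a learning rate tuned for the Caffe loss barely moves the weights.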

mingmingDiii commented 6 years ago

@liu6381810 Sorry to bother you. Did you get the loss of 6777 by using this loss function?

```python
def euclidean_loss(y_true, y_pred):
    return K.sum(K.square(y_true - y_pred), axis=-1)
```

I use this function and get a loss value of about 9.9. What's wrong with that? Thanks very much

liu6381810 commented 6 years ago

@mingmingDiii

```python
def euclidean_loss(y_true, y_pred):
    return K.sum(K.square(y_true - y_pred), axis=[1, 2, 3]) * 0.5
```

Also, a low loss may occur because you generate the ground-truth heat map with a low sigma.

michalfaber commented 6 years ago

@liu6381810 Thanks a lot for the clue. I've updated the loss function and now it finally works!

mingmingDiii commented 6 years ago

@liu6381810 Thanks for your reply! I will try it.

liu6381810 commented 6 years ago

@michalfaber Hi, sorry to bother you again. I want to know what sigma value you use when generating the ground-truth heatmap. In your loss.png, the stage6_L1 loss is always about 2.5 times the stage6_L2 loss. When I set sigma to the default 7, the L1 loss starts at about 90 and the L2 loss at about 17; after 20 epochs (210,000 images) the L1 loss is 60 and the L2 loss is 14, and the result is not good. But with sigma set to 16 and just 10 epochs, I get a better and more reasonable result. My loss is defined as K.sum(K.square(y_true - y_pred), axis=[1, 2, 3]) * 0.5. Could you please give me any advice? Thanks!
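To see why sigma matters here: with a stride of 8 between the 368x368 input and the 46x46 output, a small sigma makes the Gaussian peak cover only a handful of output pixels, so the regression target is extremely sparse. A quick NumPy illustration (sigma measured in input-image pixels and the 0.01 cut-off are my own assumptions):

```python
import numpy as np

def peak_coverage(sigma, stride=8, size=46, thresh=0.01):
    """Count output pixels whose Gaussian target exceeds thresh
    for a keypoint placed at the centre of a size x size map."""
    xs = np.arange(size) * stride + stride / 2.0
    gx, gy = np.meshgrid(xs, xs)
    cx = xs[size // 2]                       # keypoint at the map centre
    d2 = (gx - cx) ** 2 + (gy - cx) ** 2
    return int(np.sum(np.exp(-d2 / (2.0 * sigma ** 2)) > thresh))

# A larger sigma lights up many more target pixels, giving the
# network a denser and easier regression target.
assert peak_coverage(16) > peak_coverage(7)
```

This is consistent with the observation above that sigma = 16 trains noticeably better than sigma = 7 when the loss sums over the whole map.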

ksaluja15 commented 6 years ago

@michalfaber @liu6381810 Same problem as above. The stage 6 L1/L2 loss reaches the 66/25 level, but the results are not good. Any idea why?

ksaluja15 commented 6 years ago

I was able to replicate the results using the TensorFlow backend, @michalfaber. I was trying out the MXNet backend version; most likely there was a bug in my code. Will make a PR soon. Thanks for sharing your code :)

piperod commented 6 years ago

Hi @michalfaber thanks a lot for this great repo.

I am trying to train on a different dataset. I have the annotations for the keypoints, but I am a little lost in the flow to get the dataset into the correct format. From what I understand from the code, it is: 1. generate_masks.py, 2. generate_hdf5.py, and 3. ds_generator_client.py. Is that correct? My question is: since I have a different dataset, I'm not sure whether I need the segmentation for the masking, or how I should generate this masking to fit the generator. Thanks in advance, any help is very welcome.

anatolix commented 6 years ago

Hi, I am currently trying to implement a pure Python training script. I am somewhere in the middle of the road; the current code is copy-pasted from the original work and is not very easy to understand.

But the idea is the following: there is a good chance you don't really need masking and segmentation, just keypoints.

We have 2 masks: mask and mask_miss.

My understanding of mask_miss is the following:

1. The COCO dataset contains segmentation and keypoints.
2. Some people have keypoints; others have only segmentation and no keypoints (for example, an image may have a central person with keypoints while the people to the left and right are just segmented).
3. When you train the algorithm, you basically give the NN heatmaps for joints and limbs, no masks.
4. There is a chance the NN will find joints on a person other than the ones you have keypoints for. In this case we shouldn't calculate the loss over the masked area, because we could penalise the NN for a correct answer; we just don't know that it is correct. That is what mask_miss is for.
5. This mask is created from the segmentations of the other people, so that is the only place you really need segmentation.

About just "mask": it contains pixel segment borders. As far as I can tell, it is never used in the algorithm; maybe it was created for visualisation purposes.

So the idea is: if you have keypoints for 100% of the people in the picture, you don't really need any masks. For a lot of pictures in COCO the mask is actually empty.
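The masking described above can be sketched in NumPy roughly like this (the function name masked_l2 and the exact broadcasting are my own illustration; the repo's apply_mask instead multiplies the network outputs by the mask inside the Keras graph):

```python
import numpy as np

def masked_l2(y_true, y_pred, mask_miss):
    """Euclidean-style loss with unlabeled regions zeroed out.

    y_true, y_pred: (batch, H, W, C) target and predicted maps;
    mask_miss:      (batch, H, W) array that is 0 over people who are
                    segmented but have no keypoint annotations, 1 elsewhere,
                    so the net is never penalised for detections we
                    cannot verify.
    """
    diff = (y_pred - y_true) * mask_miss[..., None]  # broadcast over channels
    return 0.5 * np.sum(diff ** 2)
```

With mask_miss all ones (the common case noted earlier in this thread) this reduces to the plain Euclidean loss.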

piperod commented 6 years ago

@anatolix Thank you so much. It's a lot clearer now. I think I was overthinking it.

hellojialee commented 6 years ago

Hi everyone. How do you handle the joints of instances in crowded areas of the annotations? It seems that the keypoints in those areas are not annotated.

hellojialee commented 6 years ago

OK, got it. We ignore an area if at least one instance in it is not annotated.

jjjkkkjjj commented 3 years ago

@liu6381810

> Thanks for your reply @michalfaber. I generate the ground truth using the functions putGaussianMaps and putVecMaps from the code you mentioned. For the heat map, the size is 46 x 46, but only a few points are set to values > 0, with all other points set to 0. When training, the network tends to predict 0 everywhere, and even so the loss is very low because we use mean squared error. So is there any trick in the training procedure?

These conversations are quite old, but I really want to know how the problem you mentioned was resolved. Did you fix the problem where the predicted heatmaps are all 0? Was changing mean squared error to Euclidean loss the correct fix? I hope for your reply! Thanks!