What dataset did you use?
@jin-s13 It is a custom dataset of power line objects with 3 keypoints, as described here: https://github.com/open-mmlab/mmpose/issues/707#issue-913455927. The dataset has 200 images in total (160 training and 40 validation). I am using mobilenetv3 to train my detector, with an image size of 256x256.
OK.
- You can visualize the results and check whether the problem is due to misuse of the COCO evaluation tool.
- It seems that you are using the COCO evaluation tool. It requires sigmas as input.

You should also check sigmas during evaluation.
@jin-s13 How can I calculate the sigma values for my own dataset? Or should I use another evaluation metric such as PCK, AUC, or EPE? Which one is preferable?
The sigmas measure the labeling error (the standard deviation of the annotators).
mAP is useful when there are multiple objects in an image. I recommend using PCK and AUC for evaluation if the image only contains one object.
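To make the role of the sigmas concrete, here is a minimal sketch of the OKS (Object Keypoint Similarity) score that the COCO keypoint evaluation is built on; the function name and argument layout are illustrative, not mmpose internals.

```python
import numpy as np

def compute_oks(pred_kpts, gt_kpts, visibility, area, sigmas):
    """Object Keypoint Similarity for a single instance (illustrative sketch).

    pred_kpts, gt_kpts: (K, 2) arrays of predicted / ground-truth (x, y).
    visibility:         (K,) array, > 0 for labeled keypoints.
    area:               object area used to normalize the pixel distances.
    sigmas:             (K,) per-keypoint labeling-noise constants.
    """
    vars_ = (2 * sigmas) ** 2                         # k_i^2 in the COCO formulation
    d2 = np.sum((pred_kpts - gt_kpts) ** 2, axis=1)   # squared pixel distances
    labeled = visibility > 0
    if not labeled.any():
        return 0.0
    e = d2[labeled] / (2 * area * vars_[labeled] + np.spacing(1))
    return float(np.mean(np.exp(-e)))                 # larger sigma -> more forgiving
```

A larger sigma flattens the exponential, so the same pixel error is penalized less; that is why the choice of sigmas directly shifts the reported AP.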
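If you do switch metrics, a hedged sketch of the relevant config line follows; the supported metric names depend on your dataset class (PCK/AUC/EPE are implemented for several single-object top-down datasets in mmpose 0.x, while COCO-style datasets may only support mAP):

```python
# Evaluation section of an mmpose 0.x config; metric names assume the
# dataset's evaluate() implements them.
evaluation = dict(interval=10, metric=['PCK', 'AUC', 'EPE'], key_indicator='AUC')
```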
I have multiple objects in the image, so I believe mAP is more suitable. How can we measure sigma if the keypoints are automatically labelled, or if only one set of keypoints is available (i.e., they are not annotated by multiple people)? Can we use any default values?
It affects the mAP a lot. If you do not want to punish keypoints with small errors, you can set a higher sigma.
By the way, please also note that for top-down methods, bboxes must be provided for evaluation. Did you prepare your detection bboxes, or did you use the gt bboxes?
What values should I set, <1 or >1? Currently I am using `self.sigmas = np.array([.26, .25, .25]) / 10.0` as my sigma values. I am not evaluating on the test set, but rather on the validation set only, with workflow `[('train', 1)]` and the `validate=True` option, so I am using the gt bboxes of the validation dataset.
You may try a larger sigma, as 0.25 is very strict, e.g. `[1.0, 1.0, 1.0] / 10`.
Also, did you set `use_gt_bbox` to True in your config?
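For reference, a minimal sketch of that sigma change inside a custom dataset class; the class name is hypothetical, the attribute name is taken from the snippet above, and the import path may differ with your mmpose version (registry registration omitted for brevity):

```python
import numpy as np
from mmpose.datasets import TopDownCocoDataset  # base class; adjust to your own

class PowerLineDataset(TopDownCocoDataset):     # hypothetical custom dataset
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Relax the per-keypoint sigmas so that small localization errors are
        # penalized less by the OKS-based mAP
        # (previously np.array([.26, .25, .25]) / 10.0).
        self.sigmas = np.array([1.0, 1.0, 1.0]) / 10.0
```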
Ok noted. No, it is set to False. Should I set use_gt_bbox to True?
Yes, set `use_gt_bbox=True`.
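For context, a hedged sketch of where this flag lives in a top-down data config; the surrounding field names follow the usual mmpose 0.x layout and may differ in your config:

```python
data_cfg = dict(
    image_size=[256, 256],
    heatmap_size=[64, 64],
    num_output_channels=3,   # 3 keypoints for the power-line objects
    num_joints=3,
    use_gt_bbox=True,        # evaluate with ground-truth boxes
    det_bbox_thr=0.0,
    bbox_file='',            # only needed when use_gt_bbox=False
)
```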
@jin-s13 Thanks for all the suggestions. I did as you suggested, but the COCO eval metrics are still very poor. The loss value and acc_pose are good, so why aren't the COCO metrics improving?
2021-07-08 14:08:52,599 - mmpose - INFO - Epoch [51][50/84] lr: 5.000e-04, eta: 0:12:35, time: 0.094, data_time: 0.043, memory: 143, mse_loss: 0.0007, acc_pose: 0.7867, loss: 0.0007
Also, this happens when I add a new data augmentation technique at the start or end of the train pipeline. If I do not add the technique, the COCO eval metrics are good (approximately 75%). Moreover, if I add the proposed augmentation technique to the val pipeline (test-time augmentation), I get good results.
Sorry for the late reply. I am not sure what data augmentation you used, but this is possible if the network input is heavily changed by the augmentation; in that case, the learning of the network will be biased.
Hi @jin-s13, thanks for your reply. One point to note here: when I use the augmentation technique in the val pipeline, it does give good results, just as you use TopDownAffine in both the train and val pipelines. Does that count as test-time augmentation?
Data augmentation is meant to make the distribution of the training data similar to that of the test data. Popular data augmentation tricks are random shift, flip, and rotation. The idea is to transform the training set so that it covers all possible cases in the test set.
"The method simply merges the image with its mask image." If I understand correctly, your method changes the input format. Its purpose is not to mimic the distribution of the test set, so it is not a general data augmentation technique.
To achieve good performance, we have to make the distributions of the train set and the test set as similar as possible. That is why you obtain good performance when you also use it in the val pipeline.
@jin-s13 Yes, I get your point, but if the test set itself is very small or limited, don't you think it is a good idea to augment both the train and test sets, since it improves the overall evaluation metrics?
Yes, it is test-time augmentation.
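For completeness, a hedged sketch of what applying the same transform at test time looks like in an mmpose 0.x val pipeline; `MergeWithMask` is a hypothetical name for the augmentation described in this thread (see the registration sketch at the end of the thread):

```python
# Val/test pipeline with the custom transform applied as test-time
# augmentation, mirroring its position after TopDownAffine in training.
val_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownAffine'),
    dict(type='MergeWithMask', alpha=0.5),   # hypothetical custom transform
    dict(type='ToTensor'),
    dict(type='NormalizeTensor',
         mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    dict(type='Collect',
         keys=['img'],
         meta_keys=['image_file', 'center', 'scale', 'rotation',
                    'bbox_score', 'flip_pairs']),
]
```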
Ok noted. Thanks for your reply.
Hi, I am trying to implement a new data augmentation technique in the train pipeline. I have incorporated the technique after TopDownAffine in the train pipeline. When I start training the model, I get good acc_pose values and the loss also decreases; however, when evaluation is done (after 50 epochs), the resulting metrics are very poor:
Average Precision (AP) @[ IoU=0.50:0.95 | type=   all | maxDets= 20 ] = 0.001
Average Precision (AP) @[ IoU=0.50      | type=   all | maxDets= 20 ] = 0.009
Average Precision (AP) @[ IoU=0.75      | type=   all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | type=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | type= large | maxDets= 20 ] = 0.002
Average Recall    (AR) @[ IoU=0.50:0.95 | type=   all | maxDets= 20 ] = 0.015
Average Recall    (AR) @[ IoU=0.50      | type=   all | maxDets= 20 ] = 0.087
Average Recall    (AR) @[ IoU=0.75      | type=   all | maxDets= 20 ] = 0.000
Average Recall    (AR) @[ IoU=0.50:0.95 | type=medium | maxDets= 20 ] = 0.000
Average Recall    (AR) @[ IoU=0.50:0.95 | type= large | maxDets= 20 ] = 0.017
What could be the possible reason for this? Should I incorporate the augmentation method in the val pipeline as well? The method simply merges the image with its mask image and returns the result.
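As a reference for the discussion above, here is a minimal sketch of how such a transform could be registered and placed after TopDownAffine in an mmpose 0.x train pipeline. `MergeWithMask`, the `mask_img` key, and the exact import path of the PIPELINES registry are assumptions, not the author's actual implementation.

```python
from mmpose.datasets.builder import PIPELINES  # registry location may differ by mmpose version

@PIPELINES.register_module()
class MergeWithMask:
    """Blend the affine-warped image with its mask image (hypothetical)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha

    def __call__(self, results):
        img = results['img']
        mask_img = results['mask_img']  # assumes an earlier step provides this key
        merged = self.alpha * img + (1 - self.alpha) * mask_img
        results['img'] = merged.astype(img.dtype)
        return results


train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='MergeWithMask', alpha=0.5),   # custom augmentation, right after TopDownAffine
    dict(type='ToTensor'),
    dict(type='NormalizeTensor',
         mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(type='Collect',
         keys=['img', 'target', 'target_weight'],
         meta_keys=['image_file', 'joints_3d', 'joints_3d_visible', 'center',
                    'scale', 'rotation', 'flip_pairs']),
]
```

Adding the same `dict(type='MergeWithMask', ...)` entry to the val pipeline (as in the earlier sketch) is what makes it behave as test-time augmentation.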