Closed: LukeAI closed this issue 3 years ago.
If you add antialiasing=1 to a convolutional layer:

[convolutional] size=3 stride=2 filters=256 activation=leaky antialiasing=1

then the following two layers will be used instead:

[convolutional] size=3 stride=1 filters=256 activation=leaky
[convolutional] size=3 stride=2 filters=256 groups=256 activation=linear
with hardcoded weights:

| - | - | - |
|---|---|---|
| 1/16 | 2/16 | 1/16 |
| 2/16 | 4/16 | 2/16 |
| 1/16 | 2/16 | 1/16 |
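The blur-then-subsample replacement can be sketched in NumPy for a single channel (an illustrative sketch, not darknet's implementation; `blur_downsample` is a hypothetical helper name):

```python
import numpy as np

# Triangle-3 blur kernel: outer product of [1, 2, 1] with itself,
# normalized so the weights sum to 1 (the 1/16 .. 4/16 table above).
t = np.array([1.0, 2.0, 1.0])
BLUR_3x3 = np.outer(t, t) / 16.0

def blur_downsample(x, stride=2):
    """Apply the fixed 3x3 blur to a 2D map, then subsample by `stride`.

    Conceptually equivalent to a depth-wise [convolutional] layer with
    groups=channels, size=3, the hardcoded weights above, zero padding
    of 1 on each side, and the given stride.
    """
    h, w = x.shape
    padded = np.pad(x, 1)  # zero padding of 1 on each border
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i+3, j:j+3] * BLUR_3x3)
    return out[::stride, ::stride]
```

Because the weights sum to 1, a constant region stays constant away from the border, while high-frequency content is attenuated before subsampling.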
If you add antialiasing=1 to a maxpool layer:

[maxpool] size=2 stride=2 antialiasing=1

then the following two layers will be used instead:

[maxpool] size=2 stride=1
[convolutional] size=3 stride=2 filters=N_channels groups=N_channels activation=linear
with hardcoded weights:

| - | - | - |
|---|---|---|
| 1/16 | 2/16 | 1/16 |
| 2/16 | 4/16 | 2/16 |
| 1/16 | 2/16 | 1/16 |
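The [maxpool] replacement composes a stride-1 maxpool with the same fixed blur at stride 2 (the "MaxBlurPool" idea from the paper). A single-channel NumPy sketch, with hypothetical helper names and padding conventions assumed:

```python
import numpy as np

t = np.array([1.0, 2.0, 1.0])
BLUR = np.outer(t, t) / 16.0  # normalized [1,2,1] x [1,2,1] mask

def maxpool_s1(x):
    # 2x2 max pool, stride 1, padded so the output keeps the input size
    p = np.pad(x, ((0, 1), (0, 1)), constant_values=-np.inf)
    return np.array([[p[i:i+2, j:j+2].max() for j in range(x.shape[1])]
                     for i in range(x.shape[0])])

def blur_s2(x):
    # fixed 3x3 blur with zero padding of 1, then subsample by 2
    p = np.pad(x, 1)
    y = np.array([[np.sum(p[i:i+3, j:j+3] * BLUR) for j in range(x.shape[1])]
                  for i in range(x.shape[0])])
    return y[::2, ::2]

def maxblurpool(x):
    # antialiased replacement for [maxpool] size=2 stride=2
    return blur_s2(maxpool_s1(x))
```

The dense (stride-1) maxpool keeps the max response at every position; only the blur then reduces resolution, which is what makes the result approximately shift-invariant.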
project page: https://richzhang.github.io/antialiased-cnns/
paper : https://arxiv.org/abs/1904.11486
video: https://www.youtube.com/watch?time_continue=74&v=HjewNBZz00w
It may be better to use tri-3: Triangle-3 blurring with coefficients [1, 2, 1], i.e. bilinear downsampling:

| - | - | - |
|---|---|---|
| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 1 | 2 | 1 |
Or bin-5: Binomial-5. Just take a 5x5 window, multiply its elements by the values [1., 4., 6., 4., 1.] along x and y, and divide the result by 256: https://github.com/adobe/antialiased-cnns/blob/430d54870a2c1c5b258fd38f5f796df44aefee79/models_lpf/__init__.py#L39

Read page 1, first table, index N = 4: http://web.archive.org/web/20100621232359/http://www-personal.engin.umd.umich.edu/~jwvm/ece581/21_GBlur.pdf
kernel_size = 5, stride = 1, coefficient weights:

| - | - | - | - | - |
|---|---|---|---|---|
| 1 | 4 | 6 | 4 | 1 |
| 4 | 16 | 24 | 16 | 4 |
| 6 | 24 | 36 | 24 | 6 |
| 4 | 16 | 24 | 16 | 4 |
| 1 | 4 | 6 | 4 | 1 |
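The 5x5 table above is just the outer product of [1., 4., 6., 4., 1.] with itself, and its entries sum to 256, which is where the "divide result by 256" comes from. A quick NumPy check:

```python
import numpy as np

b = np.array([1.0, 4.0, 6.0, 4.0, 1.0])  # binomial coefficients, row 4 of Pascal's triangle
K = np.outer(b, b)                       # the 5x5 table above

print(K.sum())    # 256.0, hence the divisor
print(K[2, 2])    # 36.0, the center weight
```

Since the kernel is an outer product, the 2D blur is separable: it can be applied as a 1D pass along x followed by a 1D pass along y.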
Different blur kernels can be used.
You added "antialiasing=1" to convolutional layers? Awesome! So can I test it by adding that parameter to every [convolutional] layer throughout the .cfg?
@LukeAI Yes, you can add antialiasing=1 to every [convolutional] layer that has stride>1 (or stride_x>1 or stride_y>1), since antialiasing only has meaning for stride>1. After that you should re-train / fine-tune your model.
Today I will try to implement it on CPU too, and to add antialiasing=1 for [maxpool] layers with stride>1 (or stride_x>1 or stride_y>1).
I think some of these features should solve the problem of re-identification (the blinking issue):
- antialiasing=1
- scale_x_y= params: https://github.com/AlexeyAB/darknet/issues/3293

OK, I'll wait until you have added antialiasing to maxpool before I retrain.
@LukeAI Did you understand: should we use antialiasing=1 for every stride=2 layer except the 1st stride=2 layer?
@LukeAI I added antialiasing=1 for [maxpool] layers with stride>1 (or stride_x>1 or stride_y>1).
OK, I have done as you suggested (see attached) and am trying it out now. I hadn't realised that almost all of yolov3-spp is stride=1, so I guess this won't make too much difference, but I'll let you know. yolo_v3_spp_antialias.cfg.txt
ImageNet BFLOPs: 0.858, Top-1: 56.3 (expected value is ~60), Top-5: 79.5
ImageNet BFLOPs: 0.970, Top-1: 54.5 (expected value is ~60), Top-5: 77.9
I also trained another two densenet-based models. All of these models get worse results after adding antialiasing=1.
@AlexeyAB @LukeAI Hi, did you try to set random=1 when you added antialiasing=1? It seems to be a bug when both random=1 and antialiasing=1 are set: even with subdivisions=64 I get 'out of CUDA memory', but with the max image size (e.g. 608 in my case) and random=0, training works normally.
random=1 does indeed increase the memory requirements, so this probably isn't a bug. If you want to use random=1, try decreasing the training resolution; you can always increase it again at inference time.
I just tried training with antialiasing=1 in convolutional layers with stride=2, except for the very first one. I found that it made no real difference between with antialiasing and without:
@LukeAI What dataset did you use? And what model did you use?
It was a private urban roads dataset. yolo_v3_spp_scale_swish_aa.cfg.txt
@LukeAI Also try taking the cfg/weights trained without antialiasing=1, add antialiasing=1 to the cfg, and check mAP again without retraining. Will mAP be higher?

I have tried doing so; adding antialiasing=1 led to broadly worse results, mostly weaker recall. Model trained and evaluated without aa:
class_id = 0, name = Car, ap = 52.54%, Precision = 0.61, Recall = 0.51, avg IOU = 0.49, TP = 233, FP = 149
class_id = 1, name = Person, ap = 77.06%, Precision = 0.93, Recall = 0.76, avg IOU = 0.69, TP = 358, FP = 29
class_id = 2, name = Truck, ap = 63.67%, Precision = 0.75, Recall = 0.61, avg IOU = 0.58, TP = 476, FP = 161
class_id = 3, name = Traffic_light, ap = 56.18%, Precision = 0.56, Recall = 0.68, avg IOU = 0.38, TP = 116, FP = 92
class_id = 4, name = Trailer, ap = 71.56%, Precision = 0.83, Recall = 0.67, avg IOU = 0.66, TP = 268, FP = 56
for conf_thresh = 0.10, precision = 0.75, recall = 0.64, F1-score = 0.69
for conf_thresh = 0.10, TP = 1451, FP = 487, FN = 826, average IoU = 57.76 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.642031, or 64.20 %
Same model and weights, with aa added to the cfg:
class_id = 0, name = Car, ap = 44.06%, Precision = 0.92, Recall = 0.41, avg IOU = 0.74, TP = 188, FP = 17
class_id = 1, name = Person, ap = 62.32%, Precision = 0.92, Recall = 0.54, avg IOU = 0.66, TP = 254, FP = 21
class_id = 2, name = Truck, ap = 39.23%, Precision = 0.76, Recall = 0.33, avg IOU = 0.58, TP = 260, FP = 83
class_id = 3, name = Traffic_light, ap = 34.33%, Precision = 0.48, Recall = 0.48, avg IOU = 0.33, TP = 81, FP = 88
class_id = 4, name = Trailer, ap = 54.51%, Precision = 0.83, Recall = 0.48, avg IOU = 0.65, TP = 190, FP = 39
for conf_thresh = 0.10, precision = 0.80, recall = 0.43, F1-score = 0.56
for conf_thresh = 0.10, TP = 973, FP = 248, FN = 1304, average IoU = 60.17 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.468911, or 46.89 %
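As a sanity check on these logs, the printed F1-scores follow directly from the printed precision and recall (F1 is their harmonic mean):

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# without antialiasing: precision = 0.75, recall = 0.64 -> F1 = 0.69
# with antialiasing:    precision = 0.80, recall = 0.43 -> F1 = 0.56
print(round(f1(0.75, 0.64), 2), round(f1(0.80, 0.43), 2))
```

The antialiased run trades recall for precision, and the harmonic mean punishes that imbalance, which matches the mAP drop reported above.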
So maybe it doesn't give any advantage for this dataset.
Did you check the mAP on a separate validation dataset?
@LukeAI I added antialiasing=2, so you can try to use it. It uses 2x2 filters instead of 3x3 filters.
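For reference, a 2x2 blur with uniform weights [[1, 1], [1, 1]] normalized by 4 and applied with stride 2 is just 2x2 average pooling; whether antialiasing=2 normalizes exactly this way is an assumption here:

```python
import numpy as np

def blur2x2_s2(x):
    # 2x2 uniform blur at stride 2 == 2x2 average pooling
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```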
There are also several other changes.
Hey, I'll give this another go when I get GPU time. So I should add antialiasing=2 to all conv layers with stride=2 except the first one?
@LukeAI Yes. But I don't know whether it will bring any improvement in mAP. I think it's better to try the iou_thresh=0.3 param in the yolo layers.
@WongKinYiu Did you try AntiAliasing, and did you get any boost? Did I not understand it correctly (description of my understanding: https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-515779175), or is the +1-2% Top-1 with AntiAliasing just fake?
@AlexeyAB
No, I did not get any boost in my experiments: https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-533883993
I think it is because we use shift-based data augmentation (random crop).
@WongKinYiu Yes, random crop solves the shift issue: random crop lets the network learn all shifts. But I thought that maybe antialiasing=1 would not require learning shifts, so accuracy would stay the same while requiring fewer filters. But it seems antialiasing=1 even decreases accuracy: https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-533883993
@AlexeyAB Yes, it seems to decrease accuracy in this implementation. I think we need to do the corresponding back-propagation of the anti-aliasing.
@WongKinYiu
There is back-propagation for anti-aliasing. A bug in it was just fixed on 26 Oct: https://github.com/AlexeyAB/darknet/commit/29c71a190acb82aa4beda8762e087b658f4b0347 https://github.com/AlexeyAB/darknet/blob/213b82a1bd5ea6b0679c28fc1a78932453c4766e/src/convolutional_kernels.cu#L628-L642
@AlexeyAB Oh!
My models were trained before 22 Sep; maybe I should retrain them to get accurate results.
And do you think we need the corresponding back-propagation for anti-aliased pooling? For example, global avgpool does:

state.delta[in_index] += l.delta[out_index] / (l.h*l.w);

so global anti-aliased pooling would need to do:

state.delta[in_index] += l.delta[out_index] * (blur_mask[i] / sum_of_blur_mask);

For normal anti-aliasing we also need to do the corresponding back-propagation.
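The proposed rule can be sketched as the transpose of the forward blur: every output gradient is scattered back over its 3x3 input window, weighted by the normalized blur mask. An illustrative single-channel NumPy version (not darknet's actual loops; names mirror the snippet above):

```python
import numpy as np

t = np.array([1.0, 2.0, 1.0])
blur_mask = np.outer(t, t)          # [[1,2,1],[2,4,2],[1,2,1]]
sum_of_blur_mask = blur_mask.sum()  # 16

def blur_backward(delta_out, in_shape, stride=2):
    """Distribute output gradients through a fixed 3x3 blur (zero padding 1).

    This is the transpose of the forward blur, i.e.
    state.delta[in] += l.delta[out] * (blur_mask / sum_of_blur_mask).
    """
    delta_in = np.zeros(in_shape)
    oh, ow = delta_out.shape
    for oi in range(oh):
        for oj in range(ow):
            for ki in range(3):
                for kj in range(3):
                    ii = oi * stride + ki - 1  # -1 accounts for the padding offset
                    jj = oj * stride + kj - 1
                    if 0 <= ii < in_shape[0] and 0 <= jj < in_shape[1]:
                        delta_in[ii, jj] += (delta_out[oi, oj]
                                             * blur_mask[ki, kj] / sum_of_blur_mask)
    return delta_in
```

Gradient mass that falls on the zero padding is simply dropped, so border inputs receive slightly less gradient than interior ones, mirroring the forward pass.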
@WongKinYiu
Do you mean:

[avgpool]
antialiasing=1

state.delta[in_index] += l.delta[out_index] * (blur_mask[i] / sum_of_blur_mask);

How do we get blur_mask[]?

> for normal anti-aliasing we also need do corresponding back-propagation.

What is "normal anti-aliasing"? I implemented anti-aliasing just as a common depth-wise [convolutional] layer with fixed weights.
@AlexeyAB
I mean

[maxpool]
antialiasing=1

For example, currently the blur_mask for blure_size=2 is:

| 1 | 1 |
| 1 | 1 |

It is equivalent to doing maxpool(size=2, stride=1) and then avgpool(size=2, stride=2) in the forward pass. But the backward pass seems to only consider the maxpool part. We need to do the backward pass of avgpool(size=2, stride=2) and then the backward pass of maxpool(size=2, stride=1).

The blur_mask for blure_size!=2 is:

| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 1 | 2 | 1 |

Blur down-sampling can be seen as a convolutional layer with constant weights, so we need to do the corresponding backward pass of the blur down-sampling.
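The claim that gradients must flow through both stages can be checked numerically: compose maxpool(size=2, stride=1) with the fixed blur and differentiate by finite differences; the gradient of the composition carries the blur weights routed through each window's argmax. A toy single-channel check (hypothetical helper names, padding conventions assumed):

```python
import numpy as np

t = np.array([1.0, 2.0, 1.0])
BLUR = np.outer(t, t) / 16.0  # normalized [1,2,1] x [1,2,1] mask

def forward(x):
    """maxpool(size=2, stride=1) followed by the fixed 3x3 blur with stride 2."""
    h, w = x.shape
    p = np.pad(x, ((0, 1), (0, 1)), constant_values=-np.inf)
    m = np.array([[p[i:i+2, j:j+2].max() for j in range(w)] for i in range(h)])
    pb = np.pad(m, 1)  # zero padding for the blur
    y = np.array([[np.sum(pb[i:i+3, j:j+3] * BLUR) for j in range(w)]
                  for i in range(h)])
    return y[::2, ::2]

def numeric_grad(x, oi, oj, eps=1e-6):
    """d forward(x)[oi, oj] / dx by central finite differences."""
    g = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            xp, xm = x.copy(), x.copy()
            xp[i, j] += eps
            xm[i, j] -= eps
            g[i, j] = (forward(xp)[oi, oj] - forward(xm)[oi, oj]) / (2 * eps)
    return g
```

For a strictly increasing 4x4 input, the gradient of the central output sums to 1 and concentrates 9/16 of it on the corner element that wins four overlapping max windows, so the blur weights really are routed back through the maxpool argmaxes.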
@WongKinYiu There is back-propagation for antialiasing in the [maxpool] layer for training on GPU: https://github.com/AlexeyAB/darknet/blob/649abac372446e6c0114e8fbc9bbbb8b226318b9/src/maxpool_layer_kernels.cu#L195-L209
I added it for GPU, but I didn't add it for CPU, because no one trains on the CPU anyway. And if it does not work, then I will remove anti-aliasing altogether.
Are you currently trying to use AntiAliasing for the Classifier or for the Detector?
@AlexeyAB Hello,
I will retrain the models next week.
@AlexeyAB
| model | top-1 | top-5 |
|---|---|---|
| original Model A | 70.9 | 90.2 |
| old aa Model A | 69.8 | 89.5 |
| new aa Model A | 69.9 | 89.4 |
| original Model B | 70.2 | 89.7 |
| old aa Model B | 68.9 | 88.9 |
| new aa Model B | 68.9 | 88.8 |
@WongKinYiu Thanks! So I think it should be removed.
@AlexeyAB Can we still use "antialiasing=1" in our cfg?
@israfila3 It is deprecated, so I will remove it in 2 months, since it doesn't give any advantage.
@AlexeyAB Thanks for your reply. Actually I am making a report and I wanted to add the "antialiasing" results to it. Is there any chance to use it? I updated this Darknet repository on 20 December.
@israfila3 Yes, it works, but only if random=0 in the cfg-file.
This technique is reported to give a small, "free" boost to accuracy by mitigating aliasing effects within the network: https://github.com/adobe/antialiased-cnns