laclouis5 opened 5 years ago
There is already MatrixNet in the Roadmap, that is faster and more accurate than CornerNet: https://github.com/AlexeyAB/darknet/issues/3772
I added something like CenterNet: https://github.com/AlexeyAB/darknet/issues/3229#issuecomment-569412122
Thank you for this implementation!
Two different papers exist for CenterNet: Key-point Triplets (the one you implemented) and Objects as Points.
I tried the second one with the authors' repo, but results were not as good as expected on my personal dataset; Yolo v3 Tiny Pan3 is still more accurate and faster.
I'll post complete results in https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-549470673 as usual in a week or two.
@laclouis5
Did you try CenterNet dla-34 512x512?
Can you add results to your table? https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-549470673
Also try to train Yolo v3 Tiny Pan3 with pre-trained weights: https://drive.google.com/file/d/18v36esoXCh-PsOKwyP2GWrpYDptDY8Zf/view?usp=sharing
@AlexeyAB
I trained Yolo v3 Tiny Pan3 but with yolov3-tiny.conv.15, as stated in the "How to train Tiny Yolo" section; is there a major difference?
I also trained CenterNet dla-34 512x512 as well as CenterNet resnet-18 512x512.
I'm currently on vacation and can't access the training server; I'll post everything in a few weeks.
@laclouis5
I trained Yolo v3 Tiny Pan3 but with yolov3-tiny.conv.15, as stated in the "How to train Tiny Yolo" section; is there a major difference?
No. You can use any of these files.
I also trained CenterNet dla-34 512x512 as well as CenterNet resnet-18 512x512.
I'm currently on vacation and can't access the training server; I'll post everything in a few weeks.
Thanks.
It will also be interesting to compare the speed (FPS) of these models: CenterNet dla-34 512x512 vs CenterNet resnet-18 512x512 vs CSPResNeXt50-PANet-SPP, since it seems CenterNet dla-34 512x512 is slower than stated: https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/1#issuecomment-569684933
@AlexeyAB
Sure, I'll add FPS for all networks including CenterNet. Which command should I use to compute FPS precisely with the Darknet framework? Do I run demo with -dont_show and then average the FPS?
Do I run demo with -dont_show and then average the FPS?
Yes, just run ./darknet detector demo ... test.mp4 -dont_show on a video file (not a live camera).
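For what it's worth, a minimal sketch of one way to automate the averaging (obj.data, yolov3.cfg and yolov3.weights are hypothetical placeholder names; the regex assumes the demo prints lines containing "FPS:<number>" to the console, so adjust it to your build's output):

```python
import re
import statistics
import subprocess

# Hypothetical data/cfg/weights names; test.mp4 and -dont_show as in the command above.
cmd = ["./darknet", "detector", "demo", "obj.data", "yolov3.cfg",
       "yolov3.weights", "test.mp4", "-dont_show"]
proc = subprocess.run(cmd, capture_output=True, text=True)
log = proc.stdout + proc.stderr  # darknet may print to either stream

# Collect every "FPS:<number>" occurrence and average it.
fps_values = [float(v) for v in re.findall(r"FPS:\s*([0-9.]+)", log)]
if fps_values:
    print(f"mean FPS over {len(fps_values)} samples: {statistics.mean(fps_values):.1f}")
```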
@AlexeyAB I just added new results and FPS in https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-549470673.
I will train Yolo v3 Spp Pan Scale with pre-trained weights when I have GPU time.
@laclouis5 Thanks!
So on GeForce GTX 1060 you get:
yolov3.cfg/weights - ?
CenterNet dla-34 512x512 - ?
@AlexeyAB,
Ubuntu 18.04.3 LTS, Intel i7-7700 @ 3.6GHz x 8, GeForce GTX 1060 6GB, CUDA 10.0, cuDNN 7.6
I got 19 FPS with the original Yolo v3 network at 544x544 and 25 FPS at 512x512. I got 22 FPS with Yolo V3 CSR Spp Panet at 512x512.
I used pre-trained weights from ctdet-coco-dla-2x.pth for CenterNet.
Model | Network Resolution | GTX 1060 FPS | GTX 1080Ti FPS | AP@.5 | AP@.75 | AP |
---|---|---|---|---|---|---|
CenterNet dla-34 | 512x512 | 25 | ~25 | 55.1% | 40.8% | 37.4% |
CenterNet ResNet101 | 512x512 | - | 45 | 53.0% | 36.9% | 34.6% |
csresnext50-panet-spp original-optimal.cfg | 512x512 | 22 | 44 | 64.4% | 45.9% | 42.4% |
yolov3.cfg | 512x512 | 25 | 30 | ~56.0% | ~33.0% | ~32.0% |
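As a side note, per-frame latency is just 1000 / FPS, which can make the gaps easier to read; a quick sketch using the GTX 1060 column of the table above:

```python
# GTX 1060 FPS values taken from the table above.
gtx1060_fps = {
    "CenterNet dla-34 (512x512)": 25,
    "csresnext50-panet-spp (512x512)": 22,
    "yolov3 (512x512)": 25,
}
for model, fps in gtx1060_fps.items():
    print(f"{model}: {fps} FPS -> {1000.0 / fps:.1f} ms per frame")
```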
I think the benchmark between yolov3 and centernet-darknet53 would be interesting.
What is interesting is that Yolo V3 is about 1 point better in mAP@0.5 than CenterNet dla-34 but far worse in mAP@0.75 and COCO AP (roughly 8 and 5 points).
I noticed the same behaviour in my tests (https://github.com/AlexeyAB/darknet/issues/3874#issuecomment-549470673): CenterNet dla-34 has 75.7% mAP@0.5 and 41.6% COCO AP. While this mAP@0.5 is the lowest among my trained networks, the COCO AP is one of the best (between Yolo V3 Tiny Pan Mixup (40%) and Yolo V3 Tiny Pan3 (42%)).
My interpretation is that CenterNet has better precision than Yolo but worse recall. CenterNet misses lots of detections compared to Yolo, but when it does detect something the box location and size are better and the label is correct.
For example, in your results Yolo V3 is nearly on par with CenterNet dla-34 at mAP@0.5, but at mAP@0.75 Yolo V3 loses 23 points while CenterNet loses only 14, so CenterNet is more precise than Yolo in this example.
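To make the mAP@0.5 vs mAP@0.75 point concrete, here is a tiny sketch (toy boxes, not taken from any of the runs above) showing how a slightly shifted box still counts as a match at IoU 0.5 but not at 0.75:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

gt = (0, 0, 100, 100)       # ground-truth box
pred = (10, 10, 110, 110)   # prediction shifted by 10 px
print(iou(gt, pred))        # ~0.68: true positive at IoU 0.5, missed at 0.75
```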
Of course, newer networks such as CSR50-Panet are better than CenterNet in every category, including FPS and all mAPs.
Of course, newer networks such as CSR50-Panet are better than CenterNet in every category, including FPS and all mAPs.
Would you mind sharing a reference to CSR50-Panet?
@reactivetype
csresnext50-panet-spp-original-optimal.cfg https://github.com/AlexeyAB/darknet#pre-trained-models
My interpretation is that CenterNet has better precision than Yolo but worse recall. CenterNet misses lots of detections compared to Yolo, but when it does detect something the box location and size are better and the label is correct.
For example, in your results Yolo V3 is nearly on par with CenterNet dla-34 at mAP@0.5, but at mAP@0.75 Yolo V3 loses 23 points while CenterNet loses only 14, so CenterNet is more precise than Yolo in this example.
@laclouis5 When comparing the precision/recall of two detection architectures, it would be fairer to compare CenterNet and Yolo with the same backbone. I suspect the dla-34 backbone may not be efficient or optimal.
In fact, it would be possible to use csresnext50 with CenterNet. The good thing about CenterNet is that it's anchor-free and NMS is optional, making post-processing a lot lighter. It maintains precision by enriching the supervision labels. An interesting variant of CenterNet is TTFNet, which makes training even faster with better labels: https://arxiv.org/abs/1909.00700
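For reference, the NMS-free decoding in Objects as Points boils down to keeping local maxima of the center heatmap with a 3x3 max-pool. A minimal PyTorch sketch of the idea (the official repo's decode function additionally gathers offsets and sizes on top of this):

```python
import torch
import torch.nn.functional as F

def extract_centers(heatmap: torch.Tensor, k: int = 100):
    """Keep local maxima of a (B, C, H, W) center heatmap and return top-k scores/indices.

    The 3x3 max-pool comparison plays the role of NMS: a location survives only
    if it is the maximum of its 3x3 neighbourhood.
    """
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    peaks = heatmap * (pooled == heatmap).float()
    scores, indices = peaks.view(heatmap.size(0), -1).topk(k)
    return scores, indices

# Toy usage: random heatmap for 1 image, 80 classes, 128x128 output resolution.
scores, indices = extract_centers(torch.rand(1, 80, 128, 128))
```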
@reactivetype
MatrixNet is better than CenterNet:
https://arxiv.org/pdf/1908.04646v2.pdf
https://github.com/AlexeyAB/darknet/issues/3772
"KP-xNet solves problem (1) of CornerNets because all the matrix layers represent different scales and aspect ratios rather than having them all in a single layer. This also allows us to get rid of the corner pooling operation."
- Why is anchor-free good?
- What mAP can CenterNet achieve without NMS?
MatrixNet is better than CenterNet, and:
- MatrixNet uses limitations on the size and aspect ratio of objects for each detection layer - like anchors
- MatrixNet uses soft-NMS
Anchor-free is good for faster inference. It seems MatrixNet is a variant of CenterNet and CornerNet. Thanks for sharing it.
The Figure 1 you shared compares the models based on the number of parameters, which does not always correlate with actual latency.
I also see that the authors' report does not compare latencies against existing models (Table 2 in https://arxiv.org/pdf/2001.03194.pdf).
@reactivetype
Anchor-free is good for faster inference. It seems MatrixNet is a variant of CenterNet and CornerNet. Thanks for sharing it.
Execution time = 17.9 ms:
[yolo] layers - 0.3 ms
get_network_boxes - 0.4 ms
do_nms_sort - ~0.0 ms
The Figure 1 you shared compares the models based on the number of parameters, which does not always correlate with actual latency.
I also see that the authors' report does not compare latencies against existing models (Table 2 in https://arxiv.org/pdf/2001.03194.pdf).
Yes, there is no fair comparison of accuracy / speed.
MatrixNet + ResNext101-X is better than CenterNet + the very heavy HourGlass-104:
CenterNet: https://github.com/xingyizhou/CenterNet#object-detection-on-coco-validation
Hi @AlexeyAB, how are the filters calculated in the CenterNet cfg?
@keko950 Hi, as usual for the [Gaussian_yolo] layer:
filters = (classes + coords + 1) * <number of mask> = (classes + 9) * 4
(for [Gaussian_yolo], coords = 8: x, y, w, h plus their uncertainties)
@AlexeyAB Hmmm.. the cfg is wrong then?
[convolutional]
size=1
stride=1
pad=1
filters=40
activation=linear

[Gaussian_yolo]
yolo_point=right_bottom
mask = 8,9,10,11
anchors = 8,8, 10,13, 16,30, 33,23, 32,32, 30,61, 62,45, 59,119, 80,80, 116,90, 156,198, 373,326
classes=1
num=12
jitter=.3
ignore_thresh = .7
truth_thresh = 1
iou_thresh=0.213
iou_normalizer=0.5
uc_normalizer=0.5
cls_normalizer=1.0
iou_loss=mse
scale_x_y = 1.1
random=0
cfg-file is correct.
Oh yes, as usual for [Gaussian_yolo]; I fixed the previous answer.
https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
when using [Gaussian_yolo] layers, change [filters=57] filters=(classes + 9)x3 in the 3 [convolutional] before each [Gaussian_yolo] layer
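A minimal sanity check of that rule against the cfg snippet above (plain Python, just restating the formula from the README):

```python
def gaussian_yolo_filters(classes: int, num_masks: int) -> int:
    # [Gaussian_yolo]: per anchor, 8 box terms (x, y, w, h and their
    # uncertainties) + 1 objectness + `classes` class scores.
    return (classes + 9) * num_masks

# cfg above: classes=1, mask = 8,9,10,11 -> 4 masks -> filters=40
assert gaussian_yolo_filters(classes=1, num_masks=4) == 40
```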
Nice, thank you for your time!
CenterNet
Objects as Points seems to achieve a good speed-accuracy tradeoff, better than Yolo v3, and probably better than CornerNet (#3229).
GitHub repo: https://github.com/xingyizhou/CenterNet
Like CornerNet, this one works without anchor boxes (or NMS) and can regress many other properties, such as 3D location and pose estimation.
It may be interesting to test this new detection head with Darknet backbones + PAN instead of Hourglass/DLA.
Figures: Speed-accuracy Trade Off (Titan Xp); Coco Challenge State of the Art Networks; Different Backbones for Speed-Accuracy Tradeoff
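For anyone experimenting with such a head on Darknet backbones, the training target is essentially a per-class heatmap with a Gaussian splatted at every object center. A rough NumPy sketch of the idea (splat_center is a hypothetical helper; in the paper the Gaussian radius is derived from the box size rather than fixed):

```python
import numpy as np

def splat_center(heatmap: np.ndarray, cx: int, cy: int, sigma: float) -> None:
    """Draw a Gaussian peak at an object center, keeping the max where objects overlap."""
    h, w = heatmap.shape
    ys, xs = np.ogrid[:h, :w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)

# One class, 512x512 input with output stride 4 -> 128x128 heatmap.
heatmap = np.zeros((128, 128), dtype=np.float32)
splat_center(heatmap, cx=40, cy=60, sigma=3.0)
```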