ssanzr opened this issue 6 years ago
@ssanzr Yolo v3 should have much higher accuracy than Yolo v3 tiny. And higher resolution should increase accuracy.
> 1. What params did you use in the Makefile?

I used MSVS 2017, with CUDA and cuDNN enabled. Do you mean other parameters?

> 2. Did you check your dataset by Yolo_mark?

I labeled it with Yolo_mark.

> 3. Can you show content of file bad.list and bad_label.list if it will be created after training?

bad.list:

```
"C:\darknet-master\data\dog.jpg" -c0 "C:\darknet-master\data\giraffe.jpg" C:\VMData\train.txt C:\VMData\train.txt C:\VMData\train.txt
```

bad_label.list not found. How can I enable its creation?

> Did you use `detector map` command to get mAP?

Yes.

> What command did you use to get anchors?

I did not modify anything from the default .cfg file.

> Show your anchors.

```
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326

[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
```

> Why did you train only 2900 iterations for Yolo V3 608x608?

Avg loss was already quite OK. I will train it further.

> Attach your yolo v3 cfg-file.

yolov3-S.txt
@ssanzr
> bad_label.list not found. How can I enable its creation?

The bad_label.list file is created automatically only if your dataset is wrong. So your dataset is correct.

Also, Yolo v3 requires more iterations to reach high accuracy (mAP), so in general you should train it for more iterations than Yolo v3 tiny.
Try to calculate anchors, show the new anchors, and drag-n-drop into your message a screenshot of the displayed cloud of points:

```
darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 832 -height 832 -show
```

Then try to train with:

- `width=832 height=832` in the cfg-file (matching the `-width 832 -height 832` used for calc_anchors)
- `steps=40000,45000` (see the cfg sketch below)

Do your objects depend on colors, so you disabled color data augmentation?
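For reference, a minimal sketch of where those values live in the `[net]` section of the cfg-file; the `learning_rate`/`policy`/`scales` lines are the usual repository defaults, shown here only as assumptions - keep your own values:

```
[net]
width=832
height=832
...
learning_rate=0.001
policy=steps
steps=40000,45000   # at these iterations the learning rate is multiplied by the scales below
scales=.1,.1
```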
@AlexeyAB thanks a lot for your support here.

With Tiny Yolo v3 I am able to get very good results with 416x416 images, so I will try this resolution for YoloV3; getting to 50000 iterations with large images would take a couple of weeks on my system. I will change the anchors and the steps as you suggested. Let me know if this does not sound OK.

Anyway, I am very curious to understand why Tiny Yolo works quite well "out of the box" and YoloV3 does not, for the same input. Let's see if it makes sense after this round of testing.

Anchors:

```
num_of_clusters = 9, width = 416, height = 416
read labels from 646 images
loaded image: 646 box: 1232 all loaded.
calculating k-means++ ... avg IoU = 85.78 %
Saving anchors to the file: anchors.txt
anchors = 15.6920,17.3646, 16.6959,23.1417, 21.8907,19.4287, 17.8471,29.0398, 22.1554,25.0664, 17.6984,37.4543, 29.0695,23.4917, 24.8681,30.5106, 31.4007,36.9213
```

I am using grayscale images, so I disabled color data augmentation.

I am training with .png images, do you see any issue with this?

Last night I trained YoloV3 at 608x608 further, and the mAP has increased significantly but not the average IoU. See the updated table below:
Neural Network | Input Resolution | Iterations | Avg loss | Average IoU (%) | mAP (%)
---|---|---|---|---|---
Tiny Yolo V3 | 416x416 | 42000 | 0.1983 | 45.59% | 61.18%
Tiny Yolo V3 | 608x608 | 21700 | 0.3469 | 46.39% | 61.29%
Tiny Yolo V3 | 832x832 | 55200 | 0.2311 | 48.69% | 56.77%
Yolo V3 | 416x416 | 19800 | 0.1945 | 0.00% | 0.00%
Yolo V3 | 608x608 | 2900 | 0.71 | 42.63% | 23.46%
**Yolo V3** | **608x608** | **5400** | **0.71** | **39.02%** | **52.88%**
Yolo V3 | 832x832 | 5600 | 0.3324 | 38.77% | 41.20%
@ssanzr
> **Yolo V3 | 608x608 | 5400 | 0.71 | 39.02% | 52.88%**

Did you train with `width=608 height=608`, or did you train with 416x416 but used 608x608 only for detection?

Did you use `random=1` for training?

Did you use anchors that were calculated for 416x416 for training on 608x608?

> Anchors: num_of_clusters = 9, width = 416, height = 416 ... anchors = 15.6920,17.3646, 16.6959,23.1417, 21.8907,19.4287, 17.8471,29.0398, 22.1554,25.0664, 17.6984,37.4543, 29.0695,23.4917, 24.8681,30.5106, 31.4007,36.9213

For your dataset, Yolo v3 with high resolution should give much higher accuracy.

Try to calculate anchors for 832x832:

```
darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 832 -height 832 -show
```

And train Yolo v3 with `width=832 height=832` and `random=1` for at least 15 000 - 20 000 iterations - you will get much higher accuracy.
@AlexeyAB
> Did you train with width=608 height=608, or did you train with 416x416 but used 608x608 only for detection?

I trained with 608x608 and I detected with 608x608.

> Did you use random=1 for training?

Yes.

> Did you use anchors that were calculated for 416x416 for training on 608x608?

That trial was done with the default anchors. I will try now with the calculated anchors:

```
num_of_clusters = 9, width = 608, height = 608
read labels from 646 images
loaded image: 646 box: 1232 all loaded.
calculating k-means++ ... avg IoU = 85.78 %
Saving anchors to the file: anchors.txt
anchors = 22.9345,25.3790, 24.4016,33.8224, 31.9940,28.3958, 26.0843,42.4427, 32.3810,36.6355, 25.8669,54.7409, 42.4861,34.3341, 36.3457,44.5924, 45.8934,53.9618
```

Why does Tiny Yolo work quite well "out of the box" at 416x416 while YoloV3 does not, for the same input and resolution? Any hypothesis?

Using the command below I am able to create the results .avi file with Tiny Yolo, but for some reason in each frame (n) the bounding boxes for frame (n-1) are plotted. Do you know how to fix this?

194 MB VOC-model - save result to the file res.avi:

```
darknet.exe detector demo data/voc.data yolo-voc.cfg yolo-voc.weights test.mp4 -i 0 -out_filename res.avi
```
@ssanzr
> Yolo V3 | 416x416 | 19800 | 0.1945 | 0.00% | 0.00%

It seems that something went wrong.

> Why does Tiny Yolo work quite well "out of the box" at 416x416 while YoloV3 does not, for the same input and resolution? Any hypothesis?

You should train Yolo v3 for many more iterations.

> Using the command below I am able to create the results .avi file with Tiny Yolo, but for some reason in each frame (n) the bounding boxes for frame (n-1) are plotted. Do you know how to fix this?

For video, Yolo averages detections over frames (n-2), (n-1), and n, and shows these boxes on frame (n-1). Just set `#define FRAMES` to `1` (instead of `3`) to disable it: https://github.com/AlexeyAB/darknet/blob/99c92f98e08c007b23b21d2e0887a59f14045efb/src/demo.c#L18
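A sketch of that one-line change near the top of src/demo.c (the surrounding code may differ depending on which commit you have checked out):

```c
/* src/demo.c */
/* #define FRAMES 3 */  /* default: average detections over 3 consecutive frames */
#define FRAMES 1        /* 1 = no averaging, boxes are drawn for the current frame only */
```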
@AlexeyAB Your support is really great, and really appreciated!!
> Do you use CUDNN_HALF=1 in the Makefile?

I am using MSVS 2017. Where can I change this?

> What GPU do you use?

NVIDIA GeForce GTX 1060 with Max-Q Design

> What is the date of your code from this repository? https://github.com/AlexeyAB/darknet

June 12

> For video, Yolo averages detections over frames (n-2), (n-1), and n, and shows these boxes on frame (n-1). Just set `#define FRAMES` to `1` (instead of `3`) to disable it: https://github.com/AlexeyAB/darknet/blob/99c92f98e08c007b23b21d2e0887a59f14045efb/src/demo.c#L18

Sorry, I am quite a beginner. Do you mean replacing `#define FRAMES 3` with `#define FRAMES 1`?

I believe the video is not averaging the position at n-1; I have just checked, and it really seems that frame n is showing the bounding box labels for frame n+1. I think I explained it wrongly before.
@ssanzr
> Do you mean replacing `#define FRAMES 3` with `#define FRAMES 1`?

Yes.

> > Do you use CUDNN_HALF=1 in the Makefile?
>
> I am using MSVS 2017. Where can I change this?
>
> > What GPU do you use?
>
> NVIDIA GeForce GTX 1060 with Max-Q Design

This is normal. For your GPU you shouldn't use CUDNN_HALF.
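For context, a sketch of the relevant Makefile flags (Linux naming; for MSVS builds the equivalent is the CUDNN_HALF preprocessor define in the project settings - the compute-capability note is an assumption based on the repository README):

```
GPU=1
CUDNN=1
CUDNN_HALF=0   # keep 0 on a GTX 1060: the FP16/Tensor-Core path targets compute capability 7.x GPUs (Volta/Turing)
```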
Hi again @AlexeyAB
It seems that setting the new anchors calculated for `-width 608 -height 608` and setting `steps=40000,45000` makes the performance worse.
Neural Network | Input Resolution | Cfg-file | Iterations | Avg loss | Average IoU (%) | mAP (%)
---|---|---|---|---|---|---
Yolo V3 | 608x608 | yolov3-VM_608.cfg | 2900 | 0.71 | 42.63% | 23.46%
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 7000 | 0.5074 | 9.24% | 3.69%
I will keep training and will let you know the progress.
@AlexeyAB
I continued training, but the results do not seem to be improving. The avg loss is slightly decreasing, but the avg IoU and mAP are not improving, or are even getting worse.
Any other idea that might help here?
Neural Network | Input Resolution | Cfg-file | Iterations | Avg loss | Average IoU (%) | mAP (%)
---|---|---|---|---|---|---
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 7000 | 0.5074 | 9.24% | 3.69%
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 8700 | 0.4312 | 55.05% | 14.26%
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 10200 | 0.3289 | 43.50% | 52.41%
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 11900 | 0.3306 | 46.67% | 30.93%
Yolo V3 | 608x608 | yolov3-VM_608-steps-anchors.cfg | 16000 | 0.1763 | 33.80% | 11.12%
@ssanzr This is very strange.
Did you change anchors in all 3 [yolo]-layers?

What batch= and subdivisions= did you use for training 416x416 and 608x608?

If the batch was the same but subdivisions was smaller for 416x416, then the mini-batch (mini_batch = batch / subdivisions) will be larger, and a larger mini-batch can train the 416x416 network better. So it might be the reason that 608x608 has lower mAP than 416x416.

But for your small objects, the higher resolution should give more of an advantage than the higher mini-batch.
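For example, with the values reported below: batch=64 with subdivisions=32 gives mini_batch = 64 / 32 = 2 images processed at once, while batch=64 with subdivisions=64 gives mini_batch = 64 / 64 = 1.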
> I have 700 training images and 300 for testing.

> num_of_clusters = 9, width = 608, height = 608 read labels from 646 images loaded image: 646 box: 1232 all loaded.
> calculating k-means++ ... avg IoU = 85.78 %
> Saving anchors to the file: anchors.txt anchors = 22.9345,25.3790, 24.4016,33.8224, 31.9940,28.3958, 26.0843,42.4427, 32.3810,36.6355, 25.8669,54.7409, 42.4861,34.3341, 36.3457,44.5924, 45.8934,53.9618

Can you show a screenshot of the cloud of points and the anchors for the Test dataset? Just set `train=test.txt` in the `obj.data` file (a sketch of this temporary change follows below) and run:

```
darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 608 -height 608 -show
```

If the anchors for the Test dataset are very different from the anchors for the Training dataset, it might be the reason that 608x608 has lower mAP than 416x416.
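A minimal sketch of that temporary obj.data edit; every value here is a placeholder for whatever your own obj.data already contains:

```
classes = 1
# temporarily point train at the test list, just for calc_anchors:
train = test.txt
valid = test.txt
names = data/obj.names
backup = backup/
```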
@AlexeyAB
> What batch= and subdivisions= did you use for training 416x416 and 608x608?

416x416: batch=64, subdivisions=32
608x608: batch=64, subdivisions=64

These values were chosen as the maximum I can achieve without a CUDA error.

> But for your small objects, the higher resolution should give more of an advantage than the higher mini-batch.

It makes sense to me. In Tiny Yolo there is no big difference between the different resolutions:
Neural Network | Input Resolution | Cfg-file | Iterations | Avg loss | Average IoU (%) | mAP (%)
---|---|---|---|---|---|---
Tiny Yolo V3 | 416x416 | yolov3-tiny-VM.cfg | 42000 | 0.1983 | 45.59% | 61.18%
Tiny Yolo V3 | 608x608 | yolov3-tiny-VM_608.cfg | 21700 | 0.3469 | 46.39% | 61.29%
Tiny Yolo V3 | 832x832 | yolov3-tiny-VM_832.cfg | 55200 | 0.2311 | 48.69% | 56.77%
> Did you train on the training dataset (700 images) and did you calculate mAP on the testing dataset (another 300 images)? It is recommended to use at least 2000 images per class, so perhaps you hit overfitting, and therefore you see the mAP decreasing.

Yes.

Anchors for the training dataset:

```
num_of_clusters = 9, width = 608, height = 608
read labels from 646 images
loaded image: 646 box: 1232 all loaded.
calculating k-means++ ... avg IoU = 85.78 %
Saving anchors to the file: anchors.txt
anchors = 22.9345,25.3790, 24.4016,33.8224, 31.9940,28.3958, 26.0843,42.4427, 32.3810,36.6355, 25.8669,54.7409, 42.4861,34.3341, 36.3457,44.5924, 45.8934,53.9618
```

Anchors for the test dataset:

```
num_of_clusters = 9, width = 608, height = 608
read labels from 301 images
loaded image: 289 box: 557 all loaded.
calculating k-means++ ... avg IoU = 86.85 %
Saving anchors to the file: anchors.txt
anchors = 4.2752,3.3777, 22.6959,33.0490, 23.9492,41.1689, 27.6245,48.9778, 34.1149,40.1174, 26.7869,57.7545, 35.4523,50.2061, 35.8496,58.8041, 32.4750,70.3333
```
@ssanzr
> 416x416: batch=64, subdivisions=32
> 608x608: batch=64, subdivisions=64

Maybe this is the reason. Try to train 416x416 with batch=64, subdivisions=64, and if it still gives higher mAP than 608x608 with batch=64, subdivisions=64, then I think the problem is somewhere else, maybe in the dataset.

Did you change anchors in all 3 [yolo]-layers?

Can you show screenshots from Yolo_mark of some images from your training dataset with the marked bounding boxes?

Check that every object is labeled in your dataset - not a single object should be left without a label. In most training issues there are wrong labels in the dataset (labels produced by a conversion script, marked with a third-party tool, ...). Always check your dataset by using Yolo_mark (a sketch of the expected label-file format follows below): https://github.com/AlexeyAB/Yolo_mark https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
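For reference, a sketch of what the per-image .txt label file should look like in Yolo format - one line per object, `<class_id> <x_center> <y_center> <width> <height>`, all coordinates normalized to 0..1 relative to image size (the numbers below are made up):

```
0 0.512 0.340 0.058 0.091
0 0.204 0.615 0.047 0.083
```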
@AlexeyAB @ssanzr I have also encountered the same problem using a custom dataset (the raccoon dataset from experiencor/keras-yolo3). Using tiny yolov3 gives much higher accuracy than yolov3.

Actually, I found that yolov3 is very sensitive to the anchors from dimension clustering. When using 9 anchors (yolov3) instead of 6 anchors (tiny yolov3), some problems arise. Maybe this is the reason for your low accuracy.

Also, I think dimension clustering has some problems on a small dataset. The generated anchors are usually very close to each other, and this leads to low accuracy.
That is a good point, @ZHANGKEON. I will try YoloV3 with the 6 anchors from Tiny Yolo V3 and see what happens.
@ssanzr Maybe you can just try with original yolov3 anchors on your dataset to see how the accuracy changes.
Hi @ssanzr, as you have said, you're using grayscale images:

> - I am using grayscale images, so I disabled color data augmentation.
> - I am training with .png images, do you see any issue with this?

I'm trying to run Yolov3 on grayscale by changing the color channels to channels=1 in the yolov3.cfg file, but I'm getting a segmentation error. I also tried reducing subdivisions and setting random=0, and I tried changing the corresponding line in data.c's detection-loading function. I tried different methods to solve this problem, but I'm still unable to perform detection on grayscale.

Would you please guide me on what I should do to use grayscale images with Yolov3?

Thank you
@samgithub11
I have never seen any segmentation error with my grayscale images. Actually, my issue was that Yolo Tiny V3 worked and Yolo V3 did not.

Sorry for not being of more help...
A likely reason is overfitting. The dataset is too small, and yolov3 is deep while yolov3_tiny is small. This can cause this problem. When you use the model in a real environment, the well-trained yolov3 has better performance.
@AlexeyAB Hey, I'm using the repository from https://github.com/AlexeyAB/darknet and am trying to train on my own dataset with 200 images.

Here are my questions: 1) How can I increase my training speed? In the Makefile I enabled GPU=1 and CUDNN=1. 2) How can I use more GPUs in GCP? Available GPU: 15 GB.
@Mahibro Hi, use the `-gpus 0,1,2,3` flag: https://github.com/AlexeyAB/darknet#how-to-train-with-multi-gpu

@AlexeyAB If I set batch & subdivisions = 1, training will not progress (it asks to set batch=64). Should I make any other changes to use the GPU? (Please elaborate, I don't know much.)
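For reference, a sketch of a multi-GPU invocation in the style of that README section; the file names are placeholders, and the README recommends training roughly the first 1000 iterations on a single GPU before switching:

```
rem single GPU first:
darknet.exe detector train data/obj.data yolo-obj.cfg darknet53.conv.74

rem then continue from the partially-trained weights on 4 GPUs:
darknet.exe detector train data/obj.data yolo-obj.cfg backup/yolo-obj_1000.weights -gpus 0,1,2,3
```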
@Mahibro You must set batch=64.
Read this: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
@AlexeyAB Hi, thanks for your help.
I have a small dataset, about 1000 images, and I am training it on Yolov3. During training the avg loss is decreasing, but when I check against a new dataset, the weights from the 1000th iteration give a better result than those from the 4000th. Am I facing overfitting, or should I run it for more iterations?

How many iterations are good for Yolov3 for a small dataset with 8 classes? Should I use tiny yolov3 instead?
Thanks again!
@aaobscure What mAP do you get for weights of 1000 and 4000 iterations?
@AlexeyAB For 1000 I get 1.2; for 4000 I get 0.72.

Another question: it is not decreasing a lot after 3000 iterations; what should I do?
@aaobscure Do you get only 0.72%? It is very low.
> How many iterations are good for Yolov3 for a small dataset with 8 classes?

You should have ~16 000 images and you should train for ~16 000 iterations with batch=64.
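(For context, this matches the rule of thumb in the same repository's README - about 2000 images per class and max_batches = classes × 2000 iterations - so 8 classes gives 8 × 2000 = 16 000.)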
> Should I use tiny yolov3 instead?

You can try.
> You should have ~16 000 images and you should train for ~16 000 iterations with batch=64.

I have just 1000 images. What should I do?

Another question: how often should I change the learning rate, and how do I do it?
@ssanzr @AlexeyAB Hi, I previously used yolov3, tiny yolov3, and tiny yolov3-xnor for my detection system, but the dataset I used was color image data. This time I want to try grayscale image data. What I want to ask is:

1. How and what config do I need to change to train on my grayscale images?
2. How do I run the grayscale training results in a webcam demo, with the display switched to grayscale?

I'm sorry for my bad English.
@fashlaa Hi, just set channels=1 in the [net] section of the cfg-file.
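A sketch of that change at the top of the cfg-file; the other [net] values shown are placeholders, keep your own:

```
[net]
batch=64
subdivisions=16
width=416
height=416
channels=1   # 1 = grayscale input (3 = color, the default)
```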
@AlexeyAB Previously I used an old version of your Yolo Darknet repository. Does the old version support this? And when I run the webcam demo, will it open directly with a grayscale video stream?
Hey @AlexeyAB, please help. I used Tiny Yolov3 with 6 anchors, batch=64, subdivisions=8, and 200 images, on Windows without a GPU. I have no idea why the avg loss becomes -nan after the 30th or 40th iteration. I relabeled the images and re-downloaded the repo, but it still has the same issue. Thank you in advance, Sir. Here are some details:

```
num_of_clusters = 6, width = 416, height = 416
read labels from 200 images
loaded image: 200 box: 212 all loaded.
calculating k-means++ ... iterations = 13 avg IoU = 78.85 %
Saving anchors to the file: anchors.txt
anchors = 101,167, 158,198, 149,307, 215,229, 327,254, 249,341
```
@AlexeyAB which repository should we use for tiny yolov3?
I am training custom data with yolov3.cfg using the command:

```
darknet.exe detector train data/obj.data cfg/yolo-obj.cfg darknet53.conv.74 -mjpeg_port 8090 -map
```

but the mAP is very low. Is there any problem I need to fix?

cfg:

```
batch=64
subdivisions=64
width=416
height=416
```
How do you calculate the anchors for Tiny YoloV3? Your help, please.
@intelltech Change the anchors, since tiny yolo has only 6 anchors:

```
../../.././darknet detector calc_anchors data/obj.data -num_of_clusters 6 -width 640 -height 640
```
@joelmatt Okay, thank you. But my YoloV3-Tiny.cfg is configured as 416x416 (which I also use to train); why 640x640?
Hello, @ssanzr. How many classes are you using in this test? I'm asking because I'm facing a similar issue in my experiments, but in my case I'm using just one class. I get similar results for full YOLO and Tiny YOLO, and in some cases Tiny YOLO has better results.
Weights are only saved every 100 iterations until iteration 900, and then every 10,000. Read more here: https://github.com/pjreddie/darknet/issues/190
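For reference, the weight-saving logic in pjreddie's examples/detector.c looks roughly like this (paraphrased sketch, check your own copy of the code):

```c
/* inside train_detector(): save weights every 100 iterations until 1000, then every 10000 */
if (i % 10000 == 0 || (i < 1000 && i % 100 == 0)) {
    char buff[256];
    sprintf(buff, "%s/%s_%d.weights", backup_directory, base, i);
    save_weights(net, buff);
}
```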
@william91108 Did you manage to solve your issue? I am facing the same issue and I don't know how to go about solving it.
Hello @AlexeyAB, I am facing the following issue: for my custom dataset the avg loss is going down, but the mAP is still 0. I have ~12000 training images, of which 7000 have defects I want to identify. I am training for 4 classes of defects, and I have around 5000 images in the test set.

I am training on EC2 with 4 GPUs. I first trained for 1000 iterations on one GPU, as you suggested, and now I am training for 8000 iterations on all 4 GPUs. The avg loss is going down but the mAP is still 0, and I don't know what I should check.

All my images are 1600x256 and I have kept them that way. I have modified the anchors to use values calculated for my dataset: anchors = 20,254, 31,254, 70,253, 27,103, 25,39, 16,18, 56,56, 155,89, 434,182

In the cfg file I have batch=64, subdivisions=16, and apart from the suggested modifications for training on your own dataset, I have only played with max_batches.

I have checked with -show_imgs that the bounding boxes were showing properly.

Do you have any suggestions on why the mAP is showing 0?
Regards
Hi everyone, I have been training different YOLO networks on my custom dataset following the repository from @AlexeyAB, and I am quite puzzled by the performance obtained for each network.

I am using exactly the same .png testing and training datasets for every network: 700 training images and 300 for testing.

Performance summary for the different networks:

Why is the performance much worse with Yolo V3 than with Tiny Yolo V3?

Why does the input resolution not play a role in Tiny Yolo performance, while it has a high impact on Yolo V3?

Does anyone have any idea?
Thanks