harish-khollam opened this issue 6 years ago
@harish-khollam Hi,
Use tiny-yolo-voc.cfg instead of yolo-voc.2.0.cfg, and then train your model with:
height: 640
width: 640
random: 1
threshold: 0.6
P.S. I edited the post - don't change the maxpool layers in tiny-yolo-voc.cfg.
Or you can base your cfg-file on this:
I used tiny-yolo-voc.cfg, which looks like this:
[net]
batch=64
subdivisions=8
width=640
height=640
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.01
max_batches = 100000
policy=steps
steps=100,1000,20000,30000
scales=.1,10,.1,.1
[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=1
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
###########
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear
[region]
anchors = 0.38,0.38, 0.63,0.63, 0.93,0.93, 1.61,1.61, 3.33,3.33
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
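(For reference, the filters=30 in the last convolutional layer follows from the region-layer settings: filters = num * (coords + classes + 1) = 5 * (4 + 1 + 1) = 30. If classes or num changes, this value has to change with it.)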
"Try to increase width=832 height=832, can you see more pipes?" I tried it, but then it doesn't detect anything.
Following are a few sample images of the size range I am trying to work on. Image sizes: 2422 × 2422, 1726 × 1727, and 1008 × 1008.
"Did you have images in the training dataset which contain more than 30 labeled pipes?" Yes, all these images were labeled - you can see that every single pipe was labeled, and there were more images like these.
It would be great if you could help me tackle this issue. I have been stuck on it for a very long time; I even tried more iterations, but the average loss won't go below 11.
@harish-khollam
I have made some fixes in the code; try to update your code from this repo.
Then add the parameter max=300 to the [region] layer in your cfg-file:
[region]
anchors = 0.38,0.38, 0.63,0.63, 0.93,0.93, 1.61,1.61, 3.33,3.33
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
max=300
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
Then set width=832 and height=832 and train your model.
If the previous fixes work well, then try to train this model: tiny_yolo_custom.zip. I've made some fixes there.
@AlexeyAB I made all the changes as you suggested. This time I was able to reduce the average loss (before, it never went lower than 11, but after these modifications it reached 2.4). But as soon as it crossed 100 iterations, the average loss started increasing, and ultimately it started generating -NAN within the next 2 iterations.
A few threads suggested that it may be an issue with the labeling, so I removed all the labels with negative values (crossing the image bounds) and re-ran the model. Even after these changes it generated -NAN at iteration 101. I tried this with 832x832 as well as 640x640 in the .cfg.
As an experiment, I used the same tiny_yolo_custom.cfg with tiny-yolo-voc.weights, but the training started at iteration 40100. It hasn't generated any NaN so far. I don't know whether using tiny-yolo-voc.weights is the right choice over darknet19_448.conv.23.
What do you suggest? How can I train the model without generating NaN, using the right .weights file? What may be the reason for getting -NAN?
@harish-khollam
The most correct weights file for tiny-yolo is tiny-yolo-voc.conv.13, which you can get with this command: darknet.exe partial tiny-yolo-voc.cfg tiny-yolo-voc.weights tiny-yolo-voc.conv.13 13
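(On Linux the equivalent would presumably be: ./darknet partial tiny-yolo-voc.cfg tiny-yolo-voc.weights tiny-yolo-voc.conv.13 13 - same arguments, just the Linux binary name used in the training commands later in this thread.)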
If it doesn't solve the NaN problem, then also remove the increase of the learning rate. Change steps and scales in your cfg file to this:
steps=1000,4000
scales=.1,.1
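To illustrate what these values do, here is a minimal sketch of the policy=steps schedule. It mirrors Darknet's "steps" policy in spirit only; steps_policy_rate() is a hypothetical helper, not the repo's get_current_rate(). With a base learning_rate of 0.0001 (for example), steps=1000,4000 and scales=.1,.1, the rate only ever decreases - there is no warm-up increase.
/* Sketch: how steps/scales modify the base learning rate over iterations. */
#include <stdio.h>

float steps_policy_rate(float base_rate, int iteration,
                        const int *steps, const float *scales, int n)
{
    float rate = base_rate;
    for (int i = 0; i < n; ++i) {
        if (iteration >= steps[i]) rate *= scales[i]; /* apply each passed step */
    }
    return rate;
}

int main(void)
{
    const int   steps[]  = { 1000, 4000 };
    const float scales[] = { 0.1f, 0.1f };
    const int   iters[]  = { 100, 1000, 2500, 4000, 10000 };

    for (int i = 0; i < 5; ++i)
        printf("iteration %5d -> learning rate %g\n",
               iters[i], steps_policy_rate(0.0001f, iters[i], steps, scales, 2));
    return 0;
}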
If it doesn't solve the NaN problem, then check that your label values are not equal to 0 or 1; they should satisfy 0 < val < 1.
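If it helps, here is a minimal sketch of a standalone checker for that condition (a hypothetical helper, not part of darknet), assuming the usual YOLO txt-label format of one "<class> <x> <y> <w> <h>" line per object with relative coordinates:
/* Sketch: scan YOLO txt label files and flag values outside (0, 1). */
#include <stdio.h>

int check_label_file(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (!fp) { fprintf(stderr, "cannot open %s\n", path); return -1; }

    int cls, bad = 0, line = 0;
    float x, y, w, h;
    while (fscanf(fp, "%d %f %f %f %f", &cls, &x, &y, &w, &h) == 5) {
        ++line;
        float v[4] = { x, y, w, h };
        for (int i = 0; i < 4; ++i) {
            if (v[i] <= 0.0f || v[i] >= 1.0f) {
                printf("%s line %d: value %f is not in (0, 1)\n", path, line, v[i]);
                ++bad;
            }
        }
    }
    fclose(fp);
    return bad;   /* number of out-of-range values found */
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; ++i) check_label_file(argv[i]);
    return 0;
}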
Also, which OS do you use for training: Windows or Linux?
NaN during training can have several causes.
Thanks for your guidance @AlexeyAB.
Hello @AlexeyAB, the suggestions and modifications you prescribed worked very well; I got much better accuracy than with the previous model.
Thanks a lot for this. And it's still getting better.
But I am facing some difficulty in porting this model to mobile.
While running it in Darkflow I get the following error:
AssertionError: Over-read repo/tiny_yolo_custom_8900.weights
When I searched for this error, it says there is a mismatch between the cfg and the weights.
I wonder why this would happen. I want to test this on my iPhone. I am using the same code that was working for the previous model trained from darknet19_448.
@harish-khollam Looks great! Could you share the modifications and cfg file you are using? Also, how big is your dataset?
@harish-khollam
What command do you use to run Darkflow with your tiny_yolo_custom model?
Did you train your model using this repository or the original repository (https://github.com/pjreddie/darknet)?
Can you successfully use the default yolo-voc.weights downloaded from the pjreddie site https://pjreddie.com/media/files/yolo-voc.weights (not from Google Drive) with Darkflow?
If you can use https://pjreddie.com/media/files/yolo-voc.weights, then the relevant lines in this repo's weights-saving code are:
fwrite(net.seen, sizeof(size_t), 1, fp);
int minor = 1;
Set max_batches=8900 in your cfg (matching your tiny_yolo_custom_8900.weights), and training will be finished immediately, creating the file /backup/tiny_yolo_custom_final.weights; use that in Darkflow.
As I see it, this is a problem with Darkflow, which can't work with original Darknet weights: https://github.com/thtrieu/darkflow/issues/372#issuecomment-322009000
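For anyone hitting the same "Over-read" error, here is a minimal sketch of how the weights header can be inspected. It assumes the usual Darknet header layout (major, minor, revision, then seen) and only illustrates why a loader that expects a 4-byte seen mis-parses a file whose seen was written with sizeof(size_t) == 8; it is not Darkflow's or this repo's actual loader code:
/* Sketch: print the Darknet weights header and where the float payload starts. */
#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s file.weights\n", argv[0]); return 1; }
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) { perror("fopen"); return 1; }

    int header[3];                     /* major, minor, revision */
    if (fread(header, sizeof(int), 3, fp) != 3) { fprintf(stderr, "short file\n"); return 1; }

    /* Newer Darknet stores "seen" as 8 bytes (size_t); older readers assume
       4 bytes. A loader that guesses wrong starts reading the float weights
       4 bytes off, so the file looks mis-sized at the end ("over-read"),
       even though the cfg itself matches. */
    uint64_t seen = 0;
    if (header[0] * 10 + header[1] >= 2) {
        if (fread(&seen, sizeof(uint64_t), 1, fp) != 1) return 1;
    } else {
        uint32_t seen32 = 0;
        if (fread(&seen32, sizeof(uint32_t), 1, fp) != 1) return 1;
        seen = seen32;
    }

    printf("version %d.%d.%d, seen %llu, weights start at byte offset %ld\n",
           header[0], header[1], header[2], (unsigned long long)seen, ftell(fp));
    fclose(fp);
    return 0;
}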
Hi, try it again, but follow the instructions in the readme and use only the weights uploaded on Google Drive; the ones from darknet are not working with darkflow cfgs now.
Also, if you want to use Yolo on mobile devices, you can try the Yolo support in the dnn module built into OpenCV (version >= 3.4.0) for Android or iOS: https://opencv.org/releases.html Examples:
C++: https://github.com/opencv/opencv/blob/master/samples/dnn/yolo_object_detection.cpp
Python: https://github.com/the-house-of-black-and-white/opencv-dnn-demo/blob/master/app.py
Darkflow uses TensorFlow, which as far as I can see doesn't use the GPU on iOS yet, and OpenCV-dnn-Yolo doesn't use the GPU either, but it is highly optimized for CPU, so OpenCV-yolo can be faster than Darkflow.
http://machinethink.net/blog/tensorflow-on-ios/
Limitations of TensorFlow on iOS:
Currently there is no GPU support. TensorFlow does use the Accelerate framework for taking advantage of CPU vector instructions, but when it comes to raw speed you can’t beat Metal.
@AlexeyAB I'm training on a P2.xlarge AWS server with 11439 MiB of available memory (Tesla K80). I've tried to use your custom cfg (the most recently linked one), but I get the CUDA Error: out of memory error when resizing to 896x896. It's funny, because it initialised to 1024x1024 and trained fine. I believe I'm using the correct tiny-yolo-voc.conv.13.
My image size is 2208x1242.
Also, to add: I have a laptop with 4 GB of GPU memory (M1200), and I get out of memory even when setting batch and subdivisions to 1. Is the network you linked just really heavy? I don't have any issues with the original tiny-yolo-voc.cfg.
My cfg:
[net]
batch=64
subdivisions=16
width=832
height=832
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.0001
max_batches = 45000
policy=steps
steps=100,1000,4000
scales=10,.1,.1
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
###########
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear
[region]
anchors = 0.47,0.77, 0.55,0.98, 0.70,1.15, 0.74,0.79, 1.06,1.04
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
small_object=1
max=400
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
Start of training:
./darknet detector train blip_both.data tiny_yolo_custom_alexey.cfg tiny-yolo-voc.conv.13
tiny_yolo_custom_alexey
layer filters size input output
0 conv 32 3 x 3 / 1 832 x 832 x 3 -> 832 x 832 x 32
1 max 2 x 2 / 2 832 x 832 x 32 -> 416 x 416 x 32
2 conv 64 3 x 3 / 1 416 x 416 x 32 -> 416 x 416 x 64
3 max 2 x 2 / 2 416 x 416 x 64 -> 208 x 208 x 64
4 conv 128 3 x 3 / 1 208 x 208 x 64 -> 208 x 208 x 128
5 conv 64 1 x 1 / 1 208 x 208 x 128 -> 208 x 208 x 64
6 conv 128 3 x 3 / 1 208 x 208 x 64 -> 208 x 208 x 128
7 max 2 x 2 / 2 208 x 208 x 128 -> 104 x 104 x 128
8 conv 256 3 x 3 / 1 104 x 104 x 128 -> 104 x 104 x 256
9 conv 128 1 x 1 / 1 104 x 104 x 256 -> 104 x 104 x 128
10 conv 256 3 x 3 / 1 104 x 104 x 128 -> 104 x 104 x 256
11 max 2 x 2 / 2 104 x 104 x 256 -> 52 x 52 x 256
12 conv 512 3 x 3 / 1 52 x 52 x 256 -> 52 x 52 x 512
13 conv 512 3 x 3 / 1 52 x 52 x 512 -> 52 x 52 x 512
14 conv 30 1 x 1 / 1 52 x 52 x 512 -> 52 x 52 x 30
15 detection
Loading weights from tiny-yolo-voc.conv.13...
seen 64
Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Resizing
1024
Loaded: 45.075714 seconds
Region Avg IOU: 0.079985, Class: 1.000000, Obj: 0.498549, No Obj: 0.500154, Avg Recall: 0.000000, count: 88
Region Avg IOU: 0.090546, Class: 1.000000, Obj: 0.498464, No Obj: 0.500142, Avg Recall: 0.000000, count: 60
Region Avg IOU: 0.120469, Class: 1.000000, Obj: 0.498516, No Obj: 0.500144, Avg Recall: 0.000000, count: 76
End of training due to error:
9: 77.087448, 273.408600 avg, 0.000100 rate, 68.094284 seconds, 576 images
Loaded: 0.000052 seconds
Region Avg IOU: 0.131572, Class: 1.000000, Obj: 0.165749, No Obj: 0.167225, Avg Recall: 0.011765, count: 85
Region Avg IOU: 0.140189, Class: 1.000000, Obj: 0.165928, No Obj: 0.167228, Avg Recall: 0.000000, count: 44
Region Avg IOU: 0.142925, Class: 1.000000, Obj: 0.165996, No Obj: 0.167226, Avg Recall: 0.025316, count: 79
Region Avg IOU: 0.103536, Class: 1.000000, Obj: 0.166065, No Obj: 0.167232, Avg Recall: 0.000000, count: 24
Region Avg IOU: 0.113124, Class: 1.000000, Obj: 0.166023, No Obj: 0.167230, Avg Recall: 0.000000, count: 38
Region Avg IOU: 0.140396, Class: 1.000000, Obj: 0.164977, No Obj: 0.167221, Avg Recall: 0.000000, count: 73
Region Avg IOU: 0.131264, Class: 1.000000, Obj: 0.165162, No Obj: 0.167234, Avg Recall: 0.000000, count: 134
Region Avg IOU: 0.149730, Class: 1.000000, Obj: 0.165761, No Obj: 0.167245, Avg Recall: 0.000000, count: 94
Region Avg IOU: 0.148957, Class: 1.000000, Obj: 0.165370, No Obj: 0.167223, Avg Recall: 0.000000, count: 68
Region Avg IOU: 0.130500, Class: 1.000000, Obj: 0.165454, No Obj: 0.167231, Avg Recall: 0.000000, count: 78
Region Avg IOU: 0.126774, Class: 1.000000, Obj: 0.165550, No Obj: 0.167231, Avg Recall: 0.000000, count: 110
Region Avg IOU: 0.125435, Class: 1.000000, Obj: 0.165480, No Obj: 0.167242, Avg Recall: 0.000000, count: 39
Region Avg IOU: 0.113545, Class: 1.000000, Obj: 0.165656, No Obj: 0.167226, Avg Recall: 0.000000, count: 118
Region Avg IOU: 0.122508, Class: 1.000000, Obj: 0.165301, No Obj: 0.167229, Avg Recall: 0.000000, count: 70
Region Avg IOU: 0.134069, Class: 1.000000, Obj: 0.165200, No Obj: 0.167213, Avg Recall: 0.000000, count: 73
Region Avg IOU: 0.129905, Class: 1.000000, Obj: 0.165304, No Obj: 0.167223, Avg Recall: 0.000000, count: 65
10: 57.855984, 251.853333 avg, 0.000100 rate, 67.484894 seconds, 640 images
Resizing
896
CUDA Error: out of memory
darknet: ./src/cuda.c:36: check_error: Assertion `0' failed.
Aborted (core dumped)
@TheMikeyR Yes, this model tiny_yolo_custom.zip is ~2-3 times more expensive than the default tiny-yolo-voc.cfg. This model can detect 4x more objects in each image than tiny-yolo-voc.
I made some fixes; try to update your code from this repo and train again with random=1.
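(For context: the custom cfg downsamples by 16 instead of 32, so at 832x832 the detection grid is 52x52 rather than 26x26 - see layer 14, 52 x 52 x 30, in the training log above - i.e. 4x more grid cells, which is presumably where both the extra detections and the extra cost come from.)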
@AlexeyAB Thanks for the quick update. I've just reinitialised the training and will report back; it didn't crash on resizes to 800 and 704, so far so good.
@AlexeyAB It crashed on 928; I've included the log. I'm watching nvidia-smi in the meantime, and it peaked at 4980MiB / 11439MiB when it crashed. It used 4003MiB / 11439MiB of memory at size 704. It just seems odd that it maxes out the GPU at resolution 928?
Region Avg IOU: 0.265726, Class: 1.000000, Obj: 0.071476, No Obj: 0.071878, Avg Recall: 0.035714, count: 56
20: 15.578996, 76.250389 avg, 0.000100 rate, 37.661903 seconds, 1280 images
Resizing
928
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, 5: layer = 0, 6: layer = 0, 7: layer = 3, 8: layer = 0, 9: layer = 0, 10: layer = 0, 11: layer = 3, 12: layer = 0, 13: layer = 0, 14: layer = 0, 15: layer = 21,CUDA Error: out of memory
darknet: ./src/cuda.c:36: check_error: Assertion `0' failed.
Aborted (core dumped)
@TheMikeyR Ok, I just added one more fix.
@AlexeyAB the commit https://github.com/AlexeyAB/darknet/commit/e4ab47dfcedb4c87e5eddf484caa4ac0c020fc9b didn't work; it crashes at 768:
Resizing
768
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, 5: layer = 0, 6: layer = 0, 7: layer = 3, 8: layer = 0, 9: layer = 0, 10: layer = 0, 11: layer = 3, 12: layer = 0, 13: layer = 0, 14: layer = 0, 15: layer = 21, try to allocate workspace, CUDA Error: out of memory
I've tried modifying the default tiny-yolo-voc.cfg with the same resolution, max=400, and small_object=1, and there are no issues with that.
The cfg that works fine is:
[net]
batch=64
subdivisions=16
width=816
height=816
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.0001
max_batches = 40200
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1
[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=1
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
###########
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear
[region]
anchors = 0.47,0.77, 0.55,0.98, 0.70,1.15, 0.74,0.79, 1.06,1.04
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
small_object=1
max=400
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
I'm trying to replicate what @harish-khollam managed to do, on images of size 2208x1242 (all of my pictures); I have the same issue with detecting objects close to each other. For me it makes an average box in between the objects when that happens, even though my ground-truth labeling separates them without any issues. I think the K80 with 12 GB of memory is not strong enough for the cfg you've uploaded; do you know how much is required? It seems to crash when the size goes above 704. The odd thing is that I never see it max out my memory, but then again it is only reported every 1 second.
Edit: If the custom cfg is 2-3 times more expensive, it should still fit. The cfg I posted uses 3396MiB / 11439MiB at size 896.
Edit2: Never mind, it just resized itself to 768 and is now using 10389MiB / 11439MiB; I can't seem to figure out a reason for this?
Edit3: At 864 it goes to 2065MiB / 11439MiB.
@AlexeyAB I've posted some updates to my last comment which might help with debugging; please let me know if you need any additional info or help from me.
@TheMikeyR So if you set width=768 height=768 random=0, then memory usage is 10389MiB / 11439MiB. But if you set width=864 height=864 random=0, then memory usage is 2065MiB / 11439MiB, isn't it?
@AlexeyAB Correct, I've just tested with the cfg I posted above. If I do the same test with the custom cfg, it will run with 864, but crash with 768 due to out of memory.
@TheMikeyR Can you try this with CUDNN=0 in the Makefile?
CUDNN=0 gives 3667MiB / 11439MiB at 768; should I just continue training with CUDNN=0?
Edit: I'm using cuDNN v5.0 and CUDA 8.0.
@TheMikeyR
Try to change each of these 3 lines to this: if (s > most) { most = s; printf(" most = %zu ", most); }
(this will show which layer and which algorithm takes too much GPU RAM).
So the problem is here - for some layer sizes, cuDNN takes too much GPU RAM due to alignment for maximum performance: https://github.com/AlexeyAB/darknet/blob/e4ab47dfcedb4c87e5eddf484caa4ac0c020fc9b/src/convolutional_layer.c#L106-L132
I will check; maybe I should add code to switch to another (not the fastest) algorithm for some layer sizes, for example CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE or CUDNN_CONVOLUTION_BWD_DATA_SPECIFY_WORKSPACE_LIMIT: https://github.com/AlexeyAB/darknet/blob/e4ab47dfcedb4c87e5eddf484caa4ac0c020fc9b/src/convolutional_layer.c#L162-L177
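For illustration, a minimal sketch of what such a switch could look like with the cuDNN 5-7 preference API (the actual fix in the repo may differ; pick_bwd_data_algo() and workspace_limit_bytes are hypothetical names, and descriptor setup is assumed to happen elsewhere, as in convolutional_layer.c):
/* Sketch: prefer the fastest algorithm that fits a workspace limit,
   otherwise fall back to the algorithm that needs no workspace at all. */
#include <cudnn.h>

cudnnConvolutionBwdDataAlgo_t pick_bwd_data_algo(
    cudnnHandle_t handle,
    cudnnFilterDescriptor_t wDesc,
    cudnnTensorDescriptor_t dyDesc,
    cudnnConvolutionDescriptor_t convDesc,
    cudnnTensorDescriptor_t dxDesc,
    size_t workspace_limit_bytes)          /* e.g. a fraction of free GPU RAM */
{
    cudnnConvolutionBwdDataAlgo_t algo;

    /* Ask cuDNN for the fastest algorithm whose workspace fits the limit. */
    cudnnGetConvolutionBackwardDataAlgorithm(handle, wDesc, dyDesc, convDesc, dxDesc,
        CUDNN_CONVOLUTION_BWD_DATA_SPECIFY_WORKSPACE_LIMIT,
        workspace_limit_bytes, &algo);

    /* Double-check how much workspace it actually wants. */
    size_t ws = 0;
    cudnnGetConvolutionBackwardDataWorkspaceSize(handle, wDesc, dyDesc, convDesc, dxDesc,
        algo, &ws);
    if (ws > workspace_limit_bytes) {
        cudnnGetConvolutionBackwardDataAlgorithm(handle, wDesc, dyDesc, convDesc, dxDesc,
            CUDNN_CONVOLUTION_BWD_DATA_NO_WORKSPACE, 0, &algo);
    }
    return algo;
}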
Here are the logs from the different sizes, ordered from small to big.
No error
Resizing
672
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4431413248 9: layer = 0, most = 1024 10: layer = 0, most = 4431413248 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
Error
Loading weights from tiny-yolo-voc.conv.13... most = 2304 most = 42680320 most = 2304 most = 42680320 most = 4608 most = 157204480 most = 1024 most = 4608 most = 157204480 most = 9216 most = 18432 most = 2048
seen 64
Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Resizing
736
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 9: layer = 0, most = 1024 10: layer = 0, most = 4608 11: layer = 3, 12: layer = 0, most = 9216 most = 4503109632 13: layer = 0, most = 18432 most = 8937013248 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA Error: out of memory
No error
Resizing
800
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 most = 157204480 9: layer = 0, most = 1024 10: layer = 0, most = 4608 most = 157204480 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
No error
Resizing
832
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 most = 157204480 9: layer = 0, most = 1024 10: layer = 0, most = 4608 most = 157204480 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
Error
Resizing
864
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4431413248 most = 4499570688 9: layer = 0, most = 1024 10: layer = 0, most = 4431413248 most = 4499570688 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 8937013248 14: layer = 0, 15: layer = 21, try to allocate workspace, CUDA Error: out of memory
darknet: ./src/cuda.c:36: check_error: Assertion `0' failed.
Error
Resizing
896
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4431413248 most = 4499570688 9: layer = 0, most = 1024 10: layer = 0, most = 4431413248 most = 4499570688 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 8937013248 14: layer = 0, 15: layer = 21, try to allocate workspace, CUDA Error: out of memory
Error
Resizing
928
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4431413248 most = 4499570688 9: layer = 0, most = 1024 10: layer = 0, most = 4431413248 most = 4499570688 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 8937013248 14: layer = 0, 15: layer = 21, try to allocate workspace, CUDA Error: out of memory
No error
960
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 most = 157204480 9: layer = 0, most = 1024 10: layer = 0, most = 4608 most = 157204480 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
No error
Resizing
992
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 most = 157204480 9: layer = 0, most = 1024 10: layer = 0, most = 4608 most = 157204480 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
No error
Resizing
1024
0: layer = 0, 1: layer = 3, 2: layer = 0, 3: layer = 3, 4: layer = 0, most = 2304 most = 42680320 5: layer = 0, 6: layer = 0, most = 2304 most = 42680320 7: layer = 3, 8: layer = 0, most = 4608 most = 157204480 9: layer = 0, most = 1024 10: layer = 0, most = 4608 most = 157204480 11: layer = 3, 12: layer = 0, most = 9216 13: layer = 0, most = 18432 14: layer = 0, most = 2048 15: layer = 21, try to allocate workspace, CUDA allocate done!
@TheMikeyR Thanks! I added some fixes for cuDNN; update the code and try again.
Thanks @AlexeyAB, it seems to work now; at least when it gets to a resolution where it crashed before, it prints out that it is using the slow cuDNN algorithm without a workspace.
I can see I'm running out of memory at 1056, so I reduced the scale steps from 10 to 6, which I assume means it can go at most to 832 +- (32*6) in resolution (640 min and 1024 max); please correct me if I'm wrong. The parameter is set like this: scales=6,.1,.1.
Thanks for your help. I will let the training run for some time and report back whether it helps with my issue of detecting objects close to each other. If you want me to run some tests on a Linux system, please let me know.
@TheMikeyR Thanks for the tests.
random=1 is hardcoded here: https://github.com/AlexeyAB/darknet/blob/033e934ce82826c73d851098baf7ce4b1a27c89a/src/detector.c#L102
scales=6,.1,.1 in the cfg-file is related to the changing of learning_rate: https://github.com/AlexeyAB/darknet/issues/279#issuecomment-347002399
@AlexeyAB ah okay, thanks! I can't seem to figure out the formula. rand() = integer between 0 and 32767, % = modulo, init_w = 832 (in the custom cfg case).
I can't seem to get it to produce anything higher than 992, but I've seen it go up to 1056 (where my AWS server runs out of memory, which I would like to prevent, while still keeping the advantage of random=1). Do you have any suggestions on how to limit the resolution jumps?
Again, thanks a lot for your time and effort!
@TheMikeyR int dim = (rand() % 12 + (init_w/32 - 5)) * 32;
rand() = [0 - 32767]
rand() % 12 = [0 - 11]
(init_w/32 - 5) = 21
(rand() % 12 + (init_w/32 - 5)) * 32 = ([0 - 11] + 21) * 32 = [21 - 32] * 32 = [672 - 1024]
"I can't seem to get it to produce anything higher than 992, but I've seen it go up to 1056" - check that you set correct values (a multiple of 32), width=832 height=832, in the cfg.
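For completeness, a minimal sketch of that formula, assuming init_w = 832 as in the cfg. It just prints which network sizes random=1 can pick; it is not detector.c verbatim:
/* Sketch: the random-resize dimension formula discussed above. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    const int init_w = 832;   /* width= in the cfg, must be a multiple of 32 */
    srand((unsigned)time(NULL));

    for (int i = 0; i < 10; ++i) {
        /* rand()%12 is in [0,11]; init_w/32 - 5 = 21; so dim is in [672, 1024]. */
        int dim = (rand() % 12 + (init_w / 32 - 5)) * 32;
        printf("resize %d -> network size %d x %d\n", i, dim, dim);
    }
    return 0;
}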
Thanks for the detailed description, it is set to 832 in the config currently, so hopefully it shouldn't go above 1024 and crash. I will let it run and report back tomorrow.
Hi @harish-khollam, do you mind sharing your .cfg and weights files? I need to detect fire/smoke in my images, and my labeling is quite similar to yours.
@AlexeyAB you have used max in the [region] layer, but what is max?
@abdulkalam1233 max=300 in the [region] layer is the maximum number of truths (bounding boxes) that will be used from one txt label file. The remaining truth boxes will be rejected.
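(This also explains the earlier question about images with more than 30 labeled pipes: without this parameter, only a limited number of boxes per image - 30 by default in this code, if I recall correctly - are used during training, so densely packed labels get truncated.)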
@AlexeyAB that's great. I need your help with my application; can I have your Skype ID? Mail me at kalama449@gmail.com
Hello @AlexeyAB, I have successfully built a pipe detection model using YOLO2 on high-resolution images.
The cfg file contains the following parameters: height: 640, width: 640, random: 1, threshold: 0.6.
The model is doing pretty well on large pipes, with 100% accuracy. But if I test it on a cluster of small pipes, the accuracy decreases. As an experiment, I masked certain portions of the image and used them for testing. The model did pretty well compared to the unmasked image.
Following are the two images:
Unmasked: As you see in the following image, the model is able to detect only 1 pipe in the middle cluster.
Masked: As soon as I blur the surrounding area, it is able to detect more pipes.
A similar phenomenon is observed when a different area of the same image is masked. Why does the model perform poorly when exposed to the full image compared to only a part of it? Are there any parameters I can tweak for such a use case?