Open zhao-haha opened 6 years ago
Besides, I tried to use the 'trained' model to detect hands; unsurprisingly, no hands were detected.
Hi,
you should use `darknet.exe detector train data/obj.data tiny-yolo-hand.cfg darknet19_448.conv.23`
instead of `darknet.exe detector train data/obj.data tiny-yolo-hand.cfg tiny-yolo.weights`
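If you do want to start from tiny-yolo.weights rather than darknet19_448.conv.23, darknet's `partial` command can cut a weights file down to its first convolutional layers for transfer learning. This is a sketch: the cfg path and the layer index 13 are assumptions you would adapt to your own tiny-yolo cfg.

```shell
# Keep only the first 13 layers of tiny-yolo.weights as a
# pre-training file (choose the last layer you want to retain).
darknet.exe partial cfg/tiny-yolo.cfg tiny-yolo.weights tiny-yolo.conv.13 13
```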
Thanks for your reply! I tried training yolo2.0.cfg starting from darknet19_448.conv.23 and that problem was solved; however, the training output still seems wrong.
Here is my custom yolo2.0.cfg (classes set to 1 and filters set to 30 in the last convolutional layer):
```
[net] batch=1 subdivisions=1 width=416 height=416 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation=1.5 exposure=1.5 hue=.1
learning_rate=0.001 max_batches=120000 policy=steps steps=-1,100,80000,100000 scales=.1,10,.1,.1
[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky
[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky
#######
[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky
[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky
[route] layers=-9
[reorg] stride=2
[route] layers=-1,-3
[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky
[convolutional] size=1 stride=1 pad=1 filters=30 activation=linear
[region] anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741 bias_match=1 classes=1 coords=4 num=5 softmax=1 jitter=.2 rescore=1
object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1
absolute=1 thresh=.6 random=0
```
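The filters=30 value in the last convolutional layer is not arbitrary; for a YOLOv2 [region] head it follows from the region parameters: filters = num * (classes + coords + 1) = 5 * (1 + 4 + 1) = 30. A quick sanity check:

```python
def yolo_v2_filters(classes, num_anchors=5, coords=4):
    # Last-layer filter count for a YOLOv2 [region] head:
    # per anchor, the network predicts `coords` box values,
    # one objectness score, and `classes` class scores.
    return num_anchors * (classes + coords + 1)

print(yolo_v2_filters(classes=1))   # 30  (this thread: 1 class, 5 anchors)
print(yolo_v2_filters(classes=20))  # 125 (the stock VOC yolo-voc.cfg value)
```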
@ZhaoWangFu The training output looks all right. You should train for about 2000 iterations or more.
OK, thanks, I will try it.
One more question: should I change the width and height in the cfg, since my input images are 1280*720? Must all images be the same size? Or should I resize the 1280x720 images to a smaller size for faster speed? Thanks a lot!
@ZhaoWangFu No, you should not change anything.
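The reason no cfg change is needed is that darknet rescales every input image to the network's width/height (416x416 here) before inference, so mixed sizes like 1280x720 are fine. As a rough illustration of that rescaling step (darknet's real code uses bilinear interpolation; this sketch uses nearest-neighbor for brevity):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Map each output pixel back to its nearest source pixel.
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows[:, None], cols]

frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # a 1280x720 camera frame
net_input = resize_nearest(frame, 416, 416)       # what the network sees
print(net_input.shape)  # (416, 416, 3)
```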
Sorry, I pasted the wrong image. Why are the values all NaN?
Until I stopped the process, the values were all NaN, except for the first few lines.
@AlexeyAB Did the NaN values occur just because of insufficient training, or something else?
I removed the samples that have no annotations from my dataset and things got better. However, there are still a few NaNs with count == 0, like this:
```
Region Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: 0.010644, Avg Recall: -nan(ind), count: 0
9693: 0.002723, 0.436156 avg, 0.001000 rate, 0.240000 seconds, 9693 images
```
Is it possible to see which image darknet is using during training? Then I could remove it from my dataset.
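Rather than watching darknet's output, the offending samples can be found directly: a `count: 0` line usually means the image's YOLO label file is missing or empty. A small hedged sketch (the `data/train.txt` path in the comment is an assumption; use the list file named in your obj.data):

```python
import os

def find_unlabeled(image_paths):
    """Return image paths whose YOLO label file (same basename, .txt)
    is missing or empty. Such samples produce 'count: 0' lines with
    -nan(ind) averages in the darknet training log."""
    bad = []
    for image_path in image_paths:
        label_path = os.path.splitext(image_path)[0] + ".txt"
        if not os.path.isfile(label_path) or os.path.getsize(label_path) == 0:
            bad.append(image_path)
    return bad

# Typical use, assuming the image list from obj.data:
#   with open("data/train.txt") as f:
#       paths = [line.strip() for line in f if line.strip()]
#   for p in find_unlabeled(paths):
#       print("no labels:", p)
```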
@ZhaoWangFu Hi, nice to meet you. I have a few problems that I think you know the solutions to. Would you mind sharing your contact information, or could you contact me by QQ (398912742) or email (ethan.penx@gmail.com)? Thanks a million.
> I removed the samples that have no annotations from my dataset and things got better. However, there are still a few NaNs with count == 0.

Did you try `darknet detector recall ...`?
https://github.com/AlexeyAB/darknet#when-should-i-stop-training

@AlexeyAB Yes, the weights from 3000 iterations work fine, and the average loss is about 0.05.
@ZhaoWangFu Hello, I am training a YOLO model using the same dataset (Egohand) as you. I'm just a beginner, so may I ask you for some help? "Weights for 3000 iterations works fine" -- is the custom yolo2.0.cfg you posted above the one you used for training?
@YMelon Yes, exactly the same. Before training, I picked some 'good' samples from the Egohand dataset, because the quality of training samples is important.
@ZhaoWangFu Thanks so much for your reply! And sorry for not replying sooner; I have been picking images from Egohand as you suggested until now. Does 'good' samples mean high resolution? For example, should the following images be picked or not?
(two example frames attached: frame_1370.jpg and frame_1470.jpg)
The meaning of 'good' depends on your target. If you want to train a hand detector like I did, then I think it's important that the sample image contains a clear and complete hand.
But I am not sure whether samples containing only a small part of a hand will affect the training.
@ZhaoWangFu Yes, my target is the same as yours! So you mean I should pick images containing complete and clear hands, and drop those containing only a small part of a hand or unclear hands?
Yes, exactly.
Thanks a lot, I'll try it.
@ZhaoWangFu Hi, I'm back again! I'm sorry, but things are not going well, so I have to bother you again. I picked 1000 images for training (did I drop too many?) and the cfg is the same as yours. The CPU ran for 4~5 days, but the model has not converged even after 20000 iterations; the loss stays around 230 and does not decrease anymore. Using the trained weights from 2000, ..., 20000 iterations gives far too many output boxes. Results as follows:
```
Region Avg IOU: 0.093700, Class: 1.000000, Obj: 0.208604, No Obj: 0.436096, Avg Recall: 0.000000, count: 2
22400: 234.039932, 236.237473 avg, 0.001000 rate, 25.045385 seconds, 22400 images
Loaded: 0.000080 seconds
Region Avg IOU: 0.240916, Class: 1.000000, Obj: 0.512090, No Obj: 0.438591, Avg Recall: 0.000000, count: 2
22401: 221.970322, 234.810760 avg, 0.001000 rate, 25.234085 seconds, 22401 images
```
@ZhaoWangFu @AlexeyAB Hi, the large-loss problem went away when I used the Windows darknet version (https://github.com/AlexeyAB/darknet). But there is another problem: during training the output Obj value is very small, and when I use the trained weights from 2000 iterations for detection, no boxes are output. I used 1400 hand images for training, and the cfg file is the same as @ZhaoWangFu's. Is anything wrong, or should I just continue training for more iterations? The output is as follows. At the start:
```
1: 12.637802, 12.637802 avg, 0.000100 rate, 40.974274 seconds, 1 images
Loaded: 0.000056 seconds
Region Avg IOU: 0.255450, Class: 1.000000, Obj: 0.312328, No Obj: 0.401508, Avg Recall: 0.000000, count: 2
2: 11.193381, 12.493361 avg, 0.000100 rate, 23.499844 seconds, 2 images
Loaded: 0.000061 seconds
Region Avg IOU: 0.547629, Class: 1.000000, Obj: 0.360418, No Obj: 0.314297, Avg Recall: 0.666667, count: 3
3: 6.136428, 11.857667 avg, 0.000100 rate, 23.538784 seconds, 3 images
Loaded: 0.000059 seconds
Region Avg IOU: 0.387025, Class: 1.000000, Obj: 0.228083, No Obj: 0.216744, Avg Recall: 0.000000, count: 3
5: 3.318567, 10.275932 avg, 0.000100 rate, 40.721462 seconds, 5 images
Loaded: 0.000054 seconds
Region Avg IOU: 0.371768, Class: 1.000000, Obj: 0.171249, No Obj: 0.092966, Avg Recall: 0.500000, count: 2
6: 2.750375, 9.523376 avg, 0.000100 rate, 23.559980 seconds, 6 images
Loaded: 0.000072 seconds
Region Avg IOU: 0.321536, Class: 1.000000, Obj: 0.049954, No Obj: 0.058721, Avg Recall: 0.250000, count: 4
7: 5.134910, 9.084530 avg, 0.000100 rate, 23.591419 seconds, 7 images
Loaded: 0.000058 seconds
Region Avg IOU: 0.460021, Class: 1.000000, Obj: 0.034833, No Obj: 0.039374, Avg Recall: 0.500000, count: 2
......
```
After 2000 iterations:
```
2971: 0.691351, 0.843138 avg, 0.001000 rate, 23.632906 seconds, 2971 images
Loaded: 0.000057 seconds
Region Avg IOU: 0.404327, Class: 1.000000, Obj: 0.030576, No Obj: 0.008422, Avg Recall: 0.500000, count: 2
2972: 1.040303, 0.862855 avg, 0.001000 rate, 23.536030 seconds, 2972 images
Loaded: 0.000055 seconds
Region Avg IOU: 0.696566, Class: 1.000000, Obj: 0.023101, No Obj: 0.009011, Avg Recall: 1.000000, count: 3
2973: 0.378374, 0.814407 avg, 0.001000 rate, 23.603308 seconds, 2973 images
Loaded: 0.000065 seconds
Region Avg IOU: 0.419630, Class: 1.000000, Obj: 0.034383, No Obj: 0.008450, Avg Recall: 0.500000, count: 2
2974: 0.762052, 0.809171 avg, 0.001000 rate, 23.911331 seconds, 2974 images
Loaded: 0.000057 seconds
Region Avg IOU: 0.487199, Class: 1.000000, Obj: 0.036471, No Obj: 0.008485, Avg Recall: 0.666667, count: 3
```
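When deciding whether to keep training, the number that matters in these logs is the running average loss (the "N avg" field). A small sketch for extracting it from a pasted darknet log, so the trend can be plotted or eyeballed (the line layout assumed here matches the logs in this thread):

```python
import re

# Matches darknet progress lines like:
#   "2971: 0.691351, 0.843138 avg, 0.001000 rate, 23.632906 seconds, ..."
PROGRESS = re.compile(r"(\d+):\s*([\d.]+),\s*([\d.]+) avg")

def parse_avg_loss(log_text):
    """Return (iteration, current loss, average loss) tuples."""
    return [(int(i), float(cur), float(avg))
            for i, cur, avg in PROGRESS.findall(log_text)]

sample = "2971: 0.691351, 0.843138 avg, 0.001000 rate, 23.632906 seconds, 2971 images"
print(parse_avg_loss(sample))  # [(2971, 0.691351, 0.843138)]
```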
@YMelon
```
darknet detector test obj.data obj.cfg obj.weights -thresh 0.05
darknet detector map obj.data obj.cfg obj.weights
```
I want to train a hand detector using YOLOv2, so I cloned this repo and compiled it successfully on my PC (Win10 64-bit, GTX 1050, CUDA 8.0, cuDNN 6.0, Visual Studio 2015). Then I prepared a labelled hand dataset (2000 images at 1080x720 resolution) and split it into train (80%) and test (20%). I followed the instructions in the README and used darknet.exe to train the model. However, it stopped immediately after training started, with no errors! Does that mean the training succeeded?
Output: What does "seen 64" mean?
Input: I put all input under darknet/build/darknet/x64/data, as below:
The [obj] dir is the training set and the [test] dir is the test set; it looks like:
The labels in each txt file were converted to YOLO format, for example:
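For reference, the YOLO label format stores one object per line as `class cx cy w h`, where the box center and size are normalized by the image dimensions. A small hedged sketch of that conversion (the example box coordinates are made up for illustration):

```python
def to_yolo(box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into YOLO
    label values: center x, center y, width, height, each
    normalized to [0, 1] by the image size."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0 / img_w
    cy = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / float(img_w)
    h = (y_max - y_min) / float(img_h)
    return cx, cy, w, h

# Example: a hand box at pixels (480, 270)-(800, 450) in a 1280x720 frame,
# class id 0 (the only class, Hand).
cx, cy, w, h = to_yolo((480, 270, 800, 450), 1280, 720)
print("0 %.6f %.6f %.6f %.6f" % (cx, cy, w, h))  # 0 0.500000 0.500000 0.250000 0.250000
```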
I have only ONE class: Hand, so my obj.names has only one line:
And my obj.data:
My train.txt and test.txt look like:
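The 80/20 split mentioned above can be generated with a few lines of Python; this is a sketch, and the file names in the comment are assumptions matching this thread's layout:

```python
import random

def split_dataset(image_paths, train_frac=0.8, seed=0):
    """Shuffle image paths and split them into (train, test) lists,
    80/20 by default, with a fixed seed for reproducibility."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_frac)
    return paths[:cut], paths[cut:]

# Typical use: write the two lists referenced by obj.data, e.g.
#   train, test = split_dataset(all_image_paths)
#   with open("data/train.txt", "w") as f:
#       f.write("\n".join(train) + "\n")
```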
Models: I want to perform real-time hand detection on my PC, so I chose tiny-yolo. I copied /cfg/tiny-yolo.cfg to /data/tiny-yolo-hand.cfg and set classes=1 and filters=30. tiny-yolo.weights was downloaded from http://pjreddie.com/media/files/tiny-yolo.weights
```
[net] batch=64 subdivisions=8 width=416 height=416 channels=3 momentum=0.9 decay=0.0005 angle=0 saturation=1.5 exposure=1.5 hue=.1
learning_rate=0.001 max_batches=120000 policy=steps steps=-1,100,80000,100000 scales=.1,10,.1,.1
[convolutional] batch_normalize=1 filters=16 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=32 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=2
[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky
[maxpool] size=2 stride=1
[convolutional] batch_normalize=1 filters=1024 size=3 stride=1 pad=1 activation=leaky
###########
[convolutional] batch_normalize=1 size=3 stride=1 pad=1 filters=1024 activation=leaky
[convolutional] size=1 stride=1 pad=1 filters=30 activation=linear
[region] anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741 bias_match=1 classes=1
coords=4 num=5 softmax=1 jitter=.2 rescore=1
object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1
absolute=1 thresh=.6 random=1
```