pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

How to train tiny YOLO? #517

Open smitshilu opened 6 years ago

smitshilu commented 6 years ago

Hello, I want to train tiny YOLO on my own dataset. How can I train it?

Thank you,

AlexeyAB commented 6 years ago

Hi,

dexception commented 6 years ago

@AlexeyAB Can you tell me how many images the yolov2-tiny model was trained on?

chinmay5 commented 6 years ago

@AlexeyAB For training the tiny-yolo model with 42 classes, how many iterations are expected? The convergence rate for tiny YOLO is pretty slow. Should I consider changing a few hyper-parameters? I am attaching the config file here: yolo-tiny-obj.cfg.txt. I just updated the filters value and was able to start training.

I am also attaching a portion of the training output. I think the model is taking really long to train: the non-tiny version of YOLO started giving me good results in only 1500 iterations, while here I am 18K iterations in and still struggling. [image]

AlexeyAB commented 6 years ago

for training the tiny-yolo model having 42 classes, how many iterations are expected?

You should train about 42 x 2000 ~= 84 000 iterations (if you have about 84 000 images)

Should I consider trying to change few hyper-parameters?

It depends on what kind of objects you are trying to detect.

In general, it is better to use yolov3-tiny.cfg instead of yolov2-tiny.cfg: it has roughly the same performance (speed) but much higher accuracy. How to train yolov3-tiny: https://github.com/AlexeyAB/darknet#how-to-train-tiny-yolo-to-detect-your-custom-objects
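For reference, that README boils down to two commands; a minimal sketch, assuming the stock yolov3-tiny files and your own obj.data and custom cfg:

$ ./darknet partial cfg/yolov3-tiny.cfg yolov3-tiny.weights yolov3-tiny.conv.15 15   # extract pre-trained weights for the first 15 layers
$ ./darknet detector train data/obj.data cfg/yolov3-tiny-obj.cfg yolov3-tiny.conv.15   # train on your dataset starting from those weights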

chinmay5 commented 6 years ago

Hi @AlexeyAB, my only problem is that I have around 600 distinct annotated training images. This is the number that was provided in the competition as well, so I am not sure whether I should run so many iterations with the same data size.

AlexeyAB commented 6 years ago

@chinmay5 Yes, for 600 images you should train about 4000 - 8000 iterations, and more if you increase the data augmentation parameters: jitter, hue, saturation, exposure.
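For reference, these knobs live in the cfg file: hue, saturation, and exposure are set in the [net] section, while jitter is set per [yolo] (or [region]) layer. A minimal sketch with illustrative values:

[net]
angle=0          # random rotation in degrees
saturation=1.5   # saturation varied randomly in [1/1.5, 1.5]
exposure=1.5     # exposure varied randomly in [1/1.5, 1.5]
hue=.1           # hue shifted randomly in [-0.1, 0.1]

[yolo]
jitter=.3        # random crop/translation of the input image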

chinmay5 commented 6 years ago

@AlexeyAB I am able to train the model, thank you so much for that. Next, I want to convert this tiny-yolo-v3 model into a CoreML model so that I can use it on the iPhone. Can someone direct me to some references? More importantly, is there a way to do a direct conversion?

chinmay5 commented 6 years ago

@AlexeyAB Is there a way to obtain graphs or other measures to figure out which model I should use? I see that error values are generated, but I did not get a graph where they are plotted. I tried the IoU value, but I really want to know whether it was the best model and whether I can improve. I also have a separate test set of some 300 images.

AlexeyAB commented 6 years ago

@chinmay5 Try comparing the mAP of your models: https://github.com/AlexeyAB/darknet#when-should-i-stop-training. Or just take the latest weights file with the lowest training avg loss.
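With the AlexeyAB fork this comparison can be done with the built-in map command; a minimal sketch, where the data file, cfg, and weights names are placeholders for your own:

$ ./darknet detector map data/obj.data yolo-obj.cfg backup/yolo-obj_4000.weights   # prints mAP@0.5 computed on the valid= set from obj.data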

keides2 commented 6 years ago

Hi @AlexeyAB, I trained with yolov3-tiny_obj.cfg and yolov3-tiny.weights using the following commands:

$ ./darknet partial cfg/yolov3-tiny.cfg yolov3-tiny.weights yolov3-tiny.conv.15 15
$ ./darknet detector train data/obj.data yolov3-tiny_obj.cfg yolov3-tiny.conv.15

I want to convert the generated .weights file to a .pb file using the following darkflow command and run it on an Android mobile phone, but I got an error. (I will add the details of the error tomorrow.)

$ ./flow --model cfg/yolov3-tiny_obj.cfg --load bin/yolov3-tiny_obj-10000.weights --savepb

Can darkflow convert yolov3-tiny_obj_10000.weights to a .pb file?

If possible, can I use this .pb file with android-yolo-v2-master?

Addition:

(tensorflow) [shimatani@bslpc168 ~/darkflow]$ ./flow --model cfg/yolov3-tiny_obj.cfg --load bin/yolov3-tiny_obj_10000.weights --savepb

/home/shimatani/darkflow/darkflow/dark/darknet.py:54: UserWarning: ./cfg/yolov3-tiny_obj_10000.cfg not found, use cfg/yolov3-tiny_obj.cfg instead
Parsing cfg/yolov3-tiny_obj.cfg
Layer [yolo] not implemented

Correction: if possible, can I use this .pb file with szaza's android-yolo-v2?

AlexeyAB commented 6 years ago

@keides2

As I see, https://github.com/szaza/android-yolo-v2 and https://github.com/thtrieu/darkflow don't support Yolo v3. They support only Yolo v2.

keides2 commented 6 years ago

I got it. Thank you so much, @AlexeyAB .

Sweta02018 commented 6 years ago

I am trying to train yolov3 with darknet53 on 12,289 images on a GPU with 2 GB of memory. I get a CUDA out-of-memory error. I have tried all possible values of batch and subdivisions, but I still get the same error after around the 80th iteration. Can anyone tell me how much GPU memory is sufficient for darknet53? Thanks in advance!

chinmay5 commented 6 years ago

@AlexeyAB correct me if I am wrong, but I think the OOM depends on the resolution and size of the images. That said, 2 GB of memory is far too little; in my opinion, 4 GB should be the minimum starting point.
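For what it's worth, the memory that matters is driven by the network resolution in the cfg (not the source image size, since darknet resizes inputs) and by batch/subdivisions; a minimal sketch of cfg changes that usually avoid OOM (values illustrative):

[net]
batch=64
subdivisions=32   # larger value => fewer images per GPU mini-batch => less memory
width=416         # smaller network resolution (must be a multiple of 32) also cuts memory
height=416

# additionally, setting random=0 in the [yolo] layers avoids the random network resizing that can spike memory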

AlexeyAB commented 6 years ago

@chinmay5

sleebapaul commented 6 years ago

@chinmay5 Have a look here.

123alaa commented 6 years ago

@AlexeyAB Hey, I'm trying to train yolov3 (reproducing your training) on the COCO dataset, so I have this configuration; correct me if something is wrong:

  1. I used batch=64 and subdivisions=16.
  2. I also added the images that don't have annotations (you said that it helps).
  3. I started the learning rate at 0.001; I see that it is supposed to decay, but it does not change across iterations (see the [net] sketch below).
  4. The image size is 608 with the same anchors as for 416; is that ok? I tried this three weeks ago with various configs; the problem is that the training avg loss gets stuck at ~5.00 after 10,000 iterations. Is this valid for convergence?

Can you help? Best
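On point 3: with darknet's policy=steps the learning rate is held at learning_rate after the burn_in warm-up and only changes when the iteration count reaches the values in steps=, where it is multiplied by the corresponding scales= entries. The relevant [net] lines from a stock yolov3 COCO cfg look like this:

[net]
learning_rate=0.001
burn_in=1000            # lr ramps from 0 up to 0.001 over the first 1000 iterations
policy=steps
max_batches=500200
steps=400000,450000     # lr multiplied by 0.1 at iteration 400000 and again at 450000
scales=.1,.1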

Malouke commented 6 years ago

Hi, I am confused: you are saying OOM depends not on the size of your memory but on the image size? What about resizing at the top of the code, since the input images don't all have the same shape? No? Please give us more details, because I actually don't understand.

I have around 1000 images without any kind of data augmentation, just the original ones (colored billiard balls). If I should do data augmentation, please tell me.

visshvesh commented 6 years ago

I got it. Thank you so much, @AlexeyAB .

@keides2 How did you manage to make a .pb file out of a .weights file for yolov3 in darknet?

BaijuMishra commented 5 years ago

@AlexeyAB, I'm trying to train yolov2 (reproducing your training) on a custom dataset (2 classes). The actual image size is 1106x620 (6000 images).

In the config: 416x416, batch=64, subdivisions=8, learning rate 0.01. The problem is that the avg loss is stuck at ~1.00 after 5000 iterations.

Kindly help me out.

AlexeyAB commented 5 years ago

@BaijuMishra It is normal. Train 5000 - 8000 iterations, then check mAP by using this repository: https://github.com/AlexeyAB/darknet#when-should-i-stop-training

BaijuMishra commented 5 years ago

@AlexeyAB Thank you for the response! :) I will definitely check mAP after more iterations, but before proceeding I have some confusion about a few parameters: 1. I have images with dimensions 1106 x 620; can I use 960x618 (which is closer to my image dimensions) in the config file as width and height? 2. When I set random=1, I get a CUDA malloc error.

Please guide me on these.

Also, is there any option to add new objects using transfer learning from an existing Darknet model?

Regards, Baiju

BaijuMishra commented 5 years ago

@AlexeyAB, for Yolov3, please let me know how to save the .weights file as .pb and the .cfg file as .metadata.

Thank you ,

Regards, Baiju

taojin1992 commented 5 years ago

@chinmay5 Yes, for 600 images, you should train about 4000 - 8000 iterations. And more if you increased data augmentation parameters: jitter, hue, saturation, exposure

@AlexeyAB What is your rationale for predetermining this range of iterations? Thank you!

FlavorDots commented 4 years ago

Hello :) How can I train tiny YOLOv3 and deploy it on Android? @AlexeyAB

abdalkhalik commented 4 years ago

@AlainMindana

I tried this on yolov4 and it worked; I'm pretty sure it's the same for your requirements. Get the .weights file, then use this repo, https://github.com/hunglc007/tensorflow-yolov4-tflite (which has both a conversion script and an Android project), to:

1. Convert .weights -> .tflite (with the flag "--quantize_mode float16"); see the command sketch below.
2. Replace the .tflite file in the Android project with your generated .tflite file.
3. Go to download_model.gradle and remove the code which downloads a predefined .tflite model.
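For reference, step 1 in that repo was a two-command conversion; a sketch based on its README at the time (script names, flags, and paths may have changed, so check the current README):

$ python save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416 --input_size 416 --model yolov4 --framework tflite
$ python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-fp16.tflite --quantize_mode float16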

Razamalik4497 commented 3 years ago

@AlexeyAB YOLOv4-tiny is not detecting classes and not creating bounding boxes on video, even though I have trained for 6000 iterations. I have configured my cfg file; I have 3 different classes and a small dataset of 100 images total.

1. If this is a small-dataset problem, why does the model work well on images?

Detector command:

!./darknet detector demo data/obj.data cfg/yolov4-custom.cfg /content/darknet/backup/custom-yolov4-tiny-detector_best.weights /content/drive/MyDrive/2.mp4 -thresh 0.9 -out_filename /content/drive/MyDrive/RR.mp4
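Note that this command pairs cfg/yolov4-custom.cfg with a weights file named custom-yolov4-tiny-detector_best.weights; darknet requires the detection cfg to be the same one the weights were trained with, and a -thresh of 0.9 is high enough to suppress most boxes. A hypothetical corrected command, assuming the weights really were trained with a tiny cfg of that name:

!./darknet detector demo data/obj.data cfg/custom-yolov4-tiny-detector.cfg /content/darknet/backup/custom-yolov4-tiny-detector_best.weights /content/drive/MyDrive/2.mp4 -thresh 0.25 -out_filename /content/drive/MyDrive/RR.mp4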

CFG FILE :
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 6000
policy=steps
steps=4800, 5400
scales=.1,.1

#cutmix=1
mosaic=1

# 23:104x104 54:52x52 85:26x26 104:13x13 for 416

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-7

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-10

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-28

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

# Downsample

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -2

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish

[route]
layers = -1,-16

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=mish

##########################

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

### SPP ###
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6
### End SPP ###

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 85

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = 54

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[route]
layers = -1, -3

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

##########################

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=24
activation=linear

[yolo]
mask = 0,1,2
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=3
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.2
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=256
activation=leaky

[route]
layers = -1, -16

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=24
activation=linear

[yolo]
mask = 3,4,5
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=3
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.1
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

[route]
layers = -4

[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=24
activation=leaky

[route]
layers = -1, -37

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=24
activation=linear

[yolo]
mask = 6,7,8
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=3
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
max_delta=5

AlexeyAB commented 3 years ago

@Razamalik4497 You should use this repository for training and detection: https://github.com/AlexeyAB/darknet

Could you show both commands, for detection on images and on video?