Make a medium version of yolov3 #4009

Msmhasani commented 5 years ago

Hi, I tried yolov3 and it works perfectly on my data, but it takes too much time. Tiny-yolo, on the other hand, is fast enough for me, but its detection is poor, so I tried to make a model whose performance falls between the two. I did this by removing some convolutional layers from yolov3 while keeping the downsampling and upsampling layers matched, but I still get this error after the first [yolo] layer:

...
  44 conv     42       1 x 1/ 1      8 x   2 x1024 ->    8 x   2 x  42 0.001 BF
  45 yolo
[yolo] params: iou loss: mse, iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Unused field: 'flip = 0'
  46 route  42
  47 conv    256       1 x 1/ 1      8 x   2 x 512 ->    8 x   2 x 256 0.004 BF
  48 upsample                 2x     8 x   2 x 256 ->   16 x   4 x 256
  49 route  48 61
  50 Layer before convolutional layer must output image.: Cannot allocate memory
darknet: ./src/utils.c:293: error: Assertion `0' failed.
Aborted (core dumped)

I'm stuck at the route layer: its input has exactly the same size here as it does when I run yolov3 (which works), but I still get the error. What do you think the problem is? Thanks.

AlexeyAB commented 5 years ago

> 49 route 48 61

You can only route to previous layers, so you can't route to layer 61.

Also, you can try this Tiny model, which is better than yolov3-tiny: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt

or some of these cfg-models: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968
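
For illustration, a hedged sketch of how [route] indexing works in darknet cfg files: positive values are absolute indices of layers that must already exist earlier in the network, while negative values are offsets relative to the current layer. The indices below are hypothetical:

[route]
# absolute index: layer 36 must come earlier in the cfg than this route
layers = 36

[route]
# relative indices: -1 is the previous layer, -4 is four layers back;
# relative offsets keep working when layers are deleted, absolute ones shift
layers = -1, -4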

Msmhasani commented 5 years ago

> 49 route 48 61
>
> You can only route to previous layers, so you can't route to layer 61.
>
> Also, you can try this Tiny model, which is better than yolov3-tiny: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt
>
> or some of these cfg-models: #3114 (comment)

@AlexeyAB Thank you for the response. I will try other cfgs. I also changed the route input layer, and that error went away, but after the last [yolo] layer I got another error:

> [yolo] params: iou loss: mse, iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
> Unused field: 'flip = 0'
> Unused field: 'bias_match = 0'
> Total BFLOPS 4.883 
>  Allocate additional workspace_size = 1.18 MB 
> Learning Rate: 0.01, Momentum: 0.9, Decay: 0.1
> Loaded: 0.013103 seconds
> Floating point exception (core dumped) 

Am I still wrong with the route layers?

AlexeyAB commented 5 years ago

No.

Did you compile Darknet with GPU=1?

Also, you used incorrect params:

Unused field: 'flip = 0'
Unused field: 'bias_match = 0'

Msmhasani commented 5 years ago

> No.
>
> Did you compile Darknet with GPU=1?
>
> Also, you used incorrect params:
>
> Unused field: 'flip = 0'
> Unused field: 'bias_match = 0'

Yes, and it works for other configs with the GPU. Why are those params incorrect? I removed them and still got the same result.

LukeAI commented 5 years ago

Did you try running full-sized yolo at a lower input image size, or yolo-tiny at a higher resolution? Also, I've found that the pan2-tiny model in the experimental cfgs linked above is a good middle ground.
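
A hedged illustration of that suggestion: the input resolution is set in the [net] section of the cfg, and darknet expects width and height to be multiples of 32. The values below are examples, not recommendations:

[net]
# full yolov3 at a reduced input resolution (the stock cfg uses 416x416)
width=320
height=320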

jamessmith90 commented 5 years ago

@Msmhasani It does not make sense to seek a medium version. Increase the resolution of Yolov3-Tiny for more accuracy, or lower the resolution of Yolov3 to reduce GPU usage.

MSJawad commented 5 years ago

Hey Alexey, what command do we use for https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt to recalculate the anchors?

AlexeyAB commented 5 years ago

Don't change the anchors.

MSJawad commented 5 years ago

I thought it was good practice to recalculate the anchors every time you add something to a custom dataset.
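
For reference, the repo's readme documents an anchor-recalculation command along these lines; data/obj.data, the cluster count, and the resolution are placeholders for your own setup (this cfg's num=18 would call for -num_of_clusters 18):

./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416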

MSJawad commented 5 years ago

Also, do we still apply the filters = (classes+5)*3 formula to the conv layers before the yolo detection layers here? It looks a little messed up: you have different numbers of filters but the same number of classes (1). Are you using a different formula or something?

AlexeyAB commented 5 years ago

filters = (classes + 5) * number_of_masks

so the values of mask= differ between the [yolo] layers.
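
A worked example using the cfg posted later in this thread: classes=2 with mask = 12,13,14,15,16,17 (6 masks) gives filters = (2 + 5) * 6 = 42, which matches the filters=42 convolutional layer placed before each [yolo] layer there:

[convolutional]
size=1
stride=1
pad=1
filters=42        # (classes + 5) * number_of_masks = (2 + 5) * 6 = 42
activation=linear

[yolo]
mask = 12,13,14,15,16,17   # 6 masks in this head
classes=2
num=18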

MSJawad commented 5 years ago

Okay, thanks! Also, what weights do we use for this model?

AlexeyAB commented 5 years ago

The same as for the common tiny model.
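
For reference, a hedged sketch of what that usually means in this repo: the readme generates pre-trained weights for tiny models with the darknet partial command; the layer count 15 is the value documented for yolov3-tiny and may need adjusting for a modified cfg:

./darknet partial cfg/yolov3-tiny.cfg yolov3-tiny.weights yolov3-tiny.conv.15 15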

Msmhasani commented 5 years ago

@LukeAI @jamessmith90 Thanks for the recommendation; I will try that too. That said, my main goal in changing the model was to understand it more deeply. From my experience with other deep-learning models, I believe that "making the model smaller" and "making the image bigger" do not always have the same effect, so I wanted to try both methods.

LukeAI commented 5 years ago

> @LukeAI @jamessmith90 Thanks for the recommendation; I will try that too. That said, my main goal in changing the model was to understand it more deeply.

That's cool, I'll be interested to try your model if you share it. If I were trying to make a new model, I'd be inclined to start with yolov3-spp, keep the same depth and layer structure, but make it slimmer by reducing the number of filters in the middle layers. Some people have been trying to speed up yolov3 by training a full-sized model and then pruning the weaker weights, but the end result is supposedly essentially the same. Let us know if you try it!
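
A hedged sketch of that slimming idea, with purely illustrative numbers: keep the depth and block structure of yolov3-spp and scale the filter counts down, remembering that layers joined by a [shortcut] must keep matching output channels:

# original middle block (yolov3-spp style)
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

# slimmed variant: same structure and depth, half the filters
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky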

Msmhasani commented 5 years ago

@LukeAI I'll try the spp version too. I forgot to mention that my original images are small (say 50×200 pixels), so I don't think up-sizing them would give the model more information, because the new pixels are just interpolations of the original ones. This is the model I tried, by the way; for the first round I just commented out some layers of yolov3:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=128
subdivisions=32
width=256
height=64
channels=1
momentum=0.9
decay=0.1
angle=0
saturation = 1.5
exposure = 1.5
hue= 0.1

bias_match = 0
learning_rate=0.01
burn_in=1000
max_batches = 5000000
policy=steps
steps=4000,4500
scales=.1,.1

#[convolutional]
#batch_normalize=1
#filters=32
#size=3
#stride=1
#pad=1
#activation=leaky

## Downsample

#[convolutional]
#batch_normalize=1
#filters=64
#size=3
#stride=2
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

#[convolutional]
#batch_normalize=1
#filters=64
#size=1
#stride=1
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=64
#size=1
#stride=1
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=256
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=256
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=256
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

# Downsample

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=256
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

#[convolutional]
#batch_normalize=1
#filters=256
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=256
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=256
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

# Downsample

#[convolutional]
#batch_normalize=1
#filters=1024
#size=3
#stride=2
#pad=1
#activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

#[convolutional]
#batch_normalize=1
#filters=512
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=1024
#size=3
#stride=1
#pad=1
#activation=leaky

#[shortcut]
#from=-3
#activation=linear

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=leaky

[shortcut]
from=-3
activation=linear

######################

#[convolutional]
#batch_normalize=1
#filters=512
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#size=3
#stride=1
#pad=1
#filters=1024
#activation=leaky

#[convolutional]
#batch_normalize=1
#filters=512
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#size=3
#stride=1
#pad=1
#filters=1024
#activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=42
activation=linear

[yolo]
mask = 12,13,14,15,16,17
anchors = 12, 18,  13, 24,  16, 27,  22, 24,  20, 32,  29, 30,  20, 46,  25, 39,  24, 47,  23, 57,  29, 46,  27, 53,  31, 54,  29, 60,  37, 49,  44, 42,  35, 59,  47, 57
classes=2
num=18
jitter=.0
flip=0
ignore_thresh = .7
truth_thresh = 1
random=0

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 35

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

#[convolutional]
#batch_normalize=1
#filters=256
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#size=3
#stride=1
#pad=1
#filters=512
#activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=42
activation=linear

[yolo]
mask = 6,7,8,9,10,11
anchors =  12, 18,  13, 24,  16, 27,  22, 24,  20, 32,  29, 30,  20, 46,  25, 39,  24, 47,  23, 57,  29, 46,  27, 53,  31, 54,  29, 60,  37, 49,  44, 42,  35, 59,  47, 57
classes=2
num=18
jitter=.0
flip=0
ignore_thresh = .7
truth_thresh = 1
random=0
bias_match = 0

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1,19

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

#[convolutional]
#batch_normalize=1
#filters=128
#size=1
#stride=1
#pad=1
#activation=leaky

#[convolutional]
#batch_normalize=1
#size=3
#stride=1
#pad=1
#filters=256
#activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=42
activation=linear

[yolo]
mask = 0,1,2,3,4,5
anchors =  12, 18,  13, 24,  16, 27,  22, 24,  20, 32,  29, 30,  20, 46,  25, 39,  24, 47,  23, 57,  29, 46,  27, 53,  31, 54,  29, 60,  37, 49,  44, 42,  35, 59,  47, 57
classes=2
num=18
jitter=.0
ignore_thresh = .7
truth_thresh = 1
random=0
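
For completeness, a hedged sketch of how a custom cfg like this is normally trained in this repo; data/obj.data and yolov3-medium.cfg are placeholder names, and darknet53.conv.74 is the pre-trained backbone the readme uses for full yolov3 variants (with many backbone layers removed, those weights may not map cleanly onto this cfg):

./darknet detector train data/obj.data yolov3-medium.cfg darknet53.conv.74
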
shahla-ai commented 4 years ago

> 49 route 48 61
>
> You can only route to previous layers, so you can't route to layer 61.
>
> Also, you can try this Tiny model, which is better than yolov3-tiny: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt
>
> or some of these cfg-models: #3114 (comment)

Hello, can I use this cfg for custom object tracking? I am following the method provided in the readme for custom object tracking. I have another issue: my dataset is 1300 photos, and the max_batches recommended in the readme is number_of_classes * 2000. I used 1000 as max_batches, since the training data is 1000 pictures and the rest of the data is for testing. Is that OK, and is the dataset enough? I only have one class.
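
For reference, a hedged note on that calculation: the repo's readme suggests max_batches = classes * 2000, but not less than the number of training images and not less than 6000, with steps at 80% and 90% of max_batches. By that guidance, a one-class dataset would use at least 6000, not 1000:

[net]
max_batches = 6000    # classes*2000, but >= number of training images and >= 6000
steps = 4800,5400     # 80% and 90% of max_batches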