Is it possible to modify the YOLO network and train from scratch?

AurusHuang commented 6 years ago

I have an idea of adapting the YOLO network to other existing classification networks such as SqueezeNet or ResNet to tackle different problems. Is it possible to create such a network and train it without a pretrained weights file? What specific layers or parameters should I be aware of?

Li-Lai commented 6 years ago

These networks have already been implemented by the author: https://pjreddie.com/darknet/ Of course, you can also design your own network and train it from scratch.

AurusHuang commented 6 years ago

Thank you for your answer. However, I need some details in order to build such a network. Do you have any tutorials or outlines on how to adapt a classification network into YOLO (I'm not saying Darknet, because Darknet is a CNN framework)? I suppose the original YOLO network was modified from GoogLeNet or something similar, with a few special layers added. Also, all the training tutorials I found need a pretrained model (for example, darknet19_448.conv.23 provided by the author). I suppose those are network-specific, not universal. What should I do if I can't get such a pretrained model?

Li-Lai commented 6 years ago

I think you may have overlooked the link I gave you. Please see this one: https://pjreddie.com/darknet/imagenet/ The YOLO detection network was modified from a classification network. (The author has completed AlexNet, Darknet Reference, VGG-16, Extraction, Darknet19, Darknet19 448x448, ResNet, DenseNet and SqueezeNet.) You can use any of the above classifier networks to replace the base network of YOLO. The author also mentions this in the paper.

AurusHuang commented 6 years ago

If I design my own network (with a region layer), can I start training by just typing .\darknet detector train %datafile.data% %cfgfile.cfg%, with %datafile.data% and %cfgfile.cfg% replaced by the actual filenames? If not, what should I do before training?

Li-Lai commented 6 years ago

yes, you can.

AurusHuang commented 6 years ago

But on my machine, I can't. Darknet will terminate itself even before it loads the network.

Li-Lai commented 6 years ago

If you can, you can send your CFG configuration file to me, and I'll check it. email: nuist_lilai@foxmail.com

AurusHuang commented 6 years ago

Well... it's probably not the cfg's problem, because it also crashes when using YOLO's original cfgs.

Li-Lai commented 6 years ago

@AurusHuang post your error here. First, check your system environment configuration. Second, check your Darknet installation.

AurusHuang commented 6 years ago

Well...the debugger has located the problem here:

char *weights = (argc > 5) ? argv[5] : 0;
if (weights[strlen(weights) - 1] == 0x0d) weights[strlen(weights) - 1] = 0;

When no weights argument is given, weights is a null pointer, so strlen(weights) and weights[strlen(weights) - 1] cannot be evaluated and the program crashes.
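The guard that the two quoted lines are missing can be modelled in a few lines. This is a Python sketch of the control flow only, not the fork's actual C code; `pick_weights` is a hypothetical name, and index 5 simply mirrors `argv[5]` in the snippet.

```python
def pick_weights(argv):
    """Return the optional weights path (argv[5]), or None when it is absent."""
    weights = argv[5] if len(argv) > 5 else None
    # Guard first: only inspect the string when an argument was supplied.
    # The trailing "\r" (0x0d) is what the original C line tries to strip.
    if weights and weights.endswith("\r"):
        weights = weights[:-1]
    return weights


# Five arguments: no weights file, so nothing is dereferenced.
print(pick_weights(["darknet", "detector", "train", "voc07.data", "tiny.cfg"]))
```

The same ordering (test for a missing argument before measuring its length) is what the C version would need.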

Li-Lai commented 6 years ago

post your input command here.

AurusHuang commented 6 years ago

This command terminates the program:

.\darknet detector train cfg\voc07.data tiny-yolo-voc07.cfg

This one works:

.\darknet detector train cfg\voc07.data tiny-yolo-voc07.cfg darknet19_448.conv.23

Li-Lai commented 6 years ago

Sorry. In the source files of commit 330 and commit 390, I can't find your code [if (weights[strlen(weights) - 1] == 0x0d) weights[strlen(weights) - 1] = 0;]. I use the Linux platform, not Windows.

AurusHuang commented 6 years ago

Yes. I'm using @AlexeyAB's Darknet Windows repository. I'll check out the latest version to see whether it has been modified.

abhigoku10 commented 6 years ago

@CBIR-LL can you please share the SqueezeNet .cfg file, and what is the size of the generated weights file?

Li-Lai commented 6 years ago

https://pjreddie.com/darknet/ You can find the SqueezeNet implemented in Darknet on the page linked above.

abhigoku10 commented 6 years ago

@CBIR-LL Thanks, I was able to find the file. I have a few questions:
1. Currently I am using tiny-yolo-voc.cfg, which generates a model of size 63 MB. Would the model size decrease with SqueezeNet?
2. Or can you suggest any other .cfg file that I can use to reduce my model size?

Li-Lai commented 6 years ago

Question 1: I haven't tested it. Question 2: change the network structure; use depthwise convolution; apply pruning, trained quantization and Huffman coding.

vowstar commented 6 years ago

Yes, I modified the model and trained it successfully.

abhigoku10 commented 6 years ago

@vowstar can you please share the .cfg file and the size of the generated model file?

groot-1313 commented 6 years ago

@AurusHuang were you able to train a custom network without a pretrained weights file?

arun-kumark commented 6 years ago

Hello, I am trying to train Darknet with the Resnet152 configuration. I have 9 object classes with approximately 1000 images each. I am using the configuration file provided in the Darknet/cfg folder. Before training I made the required changes to the following files. cfvoc.data contains:

classes= 9
train = data/cfdata/cftrain.txt
valid = data/cfdata/cfval.txt
names = data/cfvoc.names
backup = backup

I also changed the cfg and demo.c files, and created the corresponding txt label files required by Darknet, as for the Pascal VOC dataset. The txt files look as follows:

1 0.408203125 0.30972222222222223 0.31640625 0.29444444444444445
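Each line in such a label file follows Darknet's detection format: class index, then box centre x, centre y, width and height, each divided by the image width or height. A minimal sketch of that conversion follows; the function name, the 1280x720 image size and the pixel box are illustrative assumptions, not values taken from the dataset in this thread.

```python
def to_darknet_label(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space box to a Darknet label line (normalised to 0..1)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / float(img_w)
    height = (ymax - ymin) / float(img_h)
    return "{} {} {} {} {}".format(class_id, x_center, y_center, width, height)


# Hypothetical box (xmin, ymin, xmax, ymax) on a hypothetical 1280x720 image.
print(to_darknet_label(1, 320, 117, 725, 329, 1280, 720))
```

All four box values should stay within 0..1; values outside that range usually point to a swapped width/height or to pixel coordinates written without normalisation.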

The model trained and converged well for max_batches 12000, and I could see a loss value of 0.06. But I am not able to run detection using the same configuration (resnet152.cfg).

Before starting the detection I changed the cfg back from batch=128 to 1 and from subdivisions=8 to 1.

The camera doesn't start; instead an assertion error is thrown.

I debugged the issue a little further and found that it happens because there is no detection layer. The assertion is caused by the cost layer present in the ResNet cfg file (I am surprised it is there when the network already has a softmax layer) and by these final layers:

[convolutional]
filters=70
size=1
stride=1
pad=1
activation=linear

[avgpool]

[softmax]
groups=1

[cost]
type=sse

The code crashes. So I tried adding a detection layer (which I feel shouldn't be needed): [detection] classes=9

Accordingly I adjusted the detection layer and tried the following values: side = 7, l.coords = 4, l.n = 2, l.classes = 9, inputs = 70

Now it looks like:

[detection]
classes=9
coords=4
rescore=1
side=7
num=2
softmax=1
sqrt=1
jitter=.2

object_scale=1
noobject_scale=.5
class_scale=1
coord_scale=5
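For a YOLOv1-style [detection] layer, the number of inputs it expects is tied to its own parameters: side*side*(num*(coords+1)+classes). The sketch below works that out for the values above. It is a back-of-the-envelope check under that assumption, with a hypothetical function name, not Darknet code.

```python
def detection_inputs(side, num, classes, coords=4):
    """Input count expected by a YOLOv1-style detection layer."""
    return side * side * (num * (coords + 1) + classes)


expected = detection_inputs(side=7, num=2, classes=9)
print(expected)        # 7*7*(2*5+9) = 931
print(expected == 70)  # False: a 70-value input does not fit this layout
```

Under this assumption, a 70-filter convolution followed by [avgpool] yields 70 values rather than 931, which may be one reason the reported percentages are meaningless.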

Now the camera opens, but the detections look like this:

scotti: 1689434%
skipper: 1612797%
kellogs: 24614%
valfrutta: 2470%
scotti: 20403%
granrisparmio: 2731%
tonnorio: 664%
valfrutta: 6342%
scotti: 6124%
barilla: 46584%
tonnorio: 46487%
kellogs: 3031%
colgate: 38375%
valfrutta: 14603%

Please guide me on how to change resnet152.cfg for detection using my trained weights.

My command to run the detection is as follows: ./darknet detector demo cfg/cfvoc.data cfg/resnet152.cfg backup/resnet152-.backup -thresh 0.86

Please help me resolve this issue.

Thank you very much.

Kind Regards Arun

groot-1313 commented 6 years ago

check out issue 391

arun-kumark commented 6 years ago

Hi Groot-1313, thanks for the link. I checked it and tried putting the [region] layer at the end of my resnet152.cfg file. The updated cfg file looks like this:


[reorg]
stride=2

[route]
layers=-1,-4

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=70
activation=linear

[region]
anchors =  1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
bias_match=1
classes=9
coords=4
num=5
softmax=1
jitter=.3
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .6
random=1

But now no detection happens at all. I also tried very low thresholds.

I am not sure where I am going wrong.

Kind Regards Arun
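One thing that can be checked mechanically in a cfg like the one above is the filter count of the convolutional layer feeding [region]. In YOLOv2-style configs it is num*(classes+coords+1), and the anchors list holds two values per anchor box. A small sketch (the function name is hypothetical):

```python
def region_filters(num, classes, coords=4):
    """Filters needed in the conv layer directly before a YOLOv2 [region] layer."""
    return num * (classes + coords + 1)


# Anchors string copied from the [region] section quoted in this thread.
anchors = ("1.3221, 1.73145, 3.19275, 4.00944, 5.05587, "
           "8.09892, 9.47112, 4.84053, 11.2364, 10.0071")
anchor_values = [float(v) for v in anchors.split(",")]

print(region_filters(num=5, classes=9))  # 5 * (9 + 4 + 1) = 70
print(len(anchor_values) // 2)           # 5 anchor pairs, matching num=5
```

So filters=70 and num=5 are consistent with classes=9 here; this check rules out one common mistake but does not by itself explain the missing detections.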

AlexeyAB commented 6 years ago

@arun-kumark

1. You should download the classification weights: https://pjreddie.com/media/files/resnet152.weights
2. Then you should get the file resnet152.200 (pre-trained weights): ./darknet partial cfg/resnet152.cfg resnet152.weights resnet152.200 200
3. And then you should train your detection network on your detection dataset of images: ./darknet detector train data/obj.data detection_resnet152.cfg resnet152.200

Where detection_resnet152.cfg is resnet152 with region layer.
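The partial and train steps can be scripted so that the intermediate file name is passed consistently from one command to the next. A minimal sketch, using only the file names and arguments quoted above; the helper name is hypothetical and nothing is executed, the commands are only assembled and printed.

```python
import shlex


def transfer_learning_commands(cls_cfg, cls_weights, partial_out, layers,
                               data_file, det_cfg):
    """Return the two Darknet invocations as argv lists (nothing is run)."""
    partial = ["./darknet", "partial", cls_cfg, cls_weights,
               partial_out, str(layers)]
    train = ["./darknet", "detector", "train", data_file, det_cfg, partial_out]
    return [partial, train]


commands = transfer_learning_commands(
    "cfg/resnet152.cfg", "resnet152.weights", "resnet152.200", 200,
    "data/obj.data", "detection_resnet152.cfg")
for argv in commands:
    print(shlex.join(argv))
```

To actually run them, each argv list could be handed to `subprocess.run(argv, check=True)` from the Darknet directory.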

arun-kumark commented 6 years ago

Hi AlexeyAB, I downloaded the weights (step 1) and followed step 2 as you described.

The file resnet152.200 was generated. I looked at the layers printed while the file was being generated; the output is below:

arun@arun:~/darknet/darknet-master$ ./darknet partial cfg/resnet152.cfg weights/resnet152.weights resnet152.200 200

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  32
    1 max          2 x 2 / 2   416 x 416 x  32   ->   208 x 208 x  32
    2 conv     64  3 x 3 / 1   208 x 208 x  32   ->   208 x 208 x  64
    3 max          2 x 2 / 2   208 x 208 x  64   ->   104 x 104 x  64
    4 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128
    5 conv     64  1 x 1 / 1   104 x 104 x 128   ->   104 x 104 x  64
    6 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128
    7 max          2 x 2 / 2   104 x 104 x 128   ->    52 x  52 x 128
    8 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256
    9 conv    128  1 x 1 / 1    52 x  52 x 256   ->    52 x  52 x 128
   10 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256
   11 max          2 x 2 / 2    52 x  52 x 256   ->    26 x  26 x 256
   12 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   13 conv    256  1 x 1 / 1    26 x  26 x 512   ->    26 x  26 x 256
   14 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   15 conv    256  1 x 1 / 1    26 x  26 x 512   ->    26 x  26 x 256
   16 conv    512  3 x 3 / 1    26 x  26 x 256   ->    26 x  26 x 512
   17 max          2 x 2 / 2    26 x  26 x 512   ->    13 x  13 x 512
   18 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   19 conv    512  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 512
   20 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   21 conv    512  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 512
   22 conv   1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024
   23 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
   24 conv   1024  3 x 3 / 1    13 x  13 x1024   ->    13 x  13 x1024
   25 route  16
   26 conv     64  1 x 1 / 1    26 x  26 x 512   ->    26 x  26 x  64
   27 reorg              / 2    26 x  26 x  64   ->    13 x  13 x 256
   28 route  27 24
   29 conv   1024  3 x 3 / 1    13 x  13 x1280   ->    13 x  13 x1024
   30 conv     70  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x  70
   31 detection

mask_scale: Using default '1.000000'
Loading weights from weights/resnet152.weights...Done!
Saving weights to resnet152.200
arun@arun:~/darknet/darknet-master$

It doesn't look like the architecture of Resnet152, which should be a much deeper network, like the one below:

layer     filters    size              input                output
    0 conv     64  7 x 7 / 2   256 x 256 x   3   ->   128 x 128 x  64
    1 max          2 x 2 / 2   128 x 128 x  64   ->    64 x  64 x  64
    2 conv     64  1 x 1 / 1    64 x  64 x  64   ->    64 x  64 x  64
    3 conv     64  3 x 3 / 1    64 x  64 x  64   ->    64 x  64 x  64
    4 conv    256  1 x 1 / 1    64 x  64 x  64   ->    64 x  64 x 256
    5 Shortcut Layer: 1
    6 conv     64  1 x 1 / 1    64 x  64 x 256   ->    64 x  64 x  64
    7 conv     64  3 x 3 / 1    64 x  64 x  64   ->    64 x  64 x  64
    8 conv    256  1 x 1 / 1    64 x  64 x  64   ->    64 x  64 x 256
    9 Shortcut Layer: 5
   10 conv     64  1 x 1 / 1    64 x  64 x 256   ->    64 x  64 x  64
   11 conv     64  3 x 3 / 1    64 x  64 x  64   ->    64 x  64 x  64
   12 conv    256  1 x 1 / 1    64 x  64 x  64   ->    64 x  64 x 256
   13 Shortcut Layer: 9
   14 conv    128  1 x 1 / 1    64 x  64 x 256   ->    64 x  64 x 128
   15 conv    128  3 x 3 / 2    64 x  64 x 128   ->    32 x  32 x 128
   16 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   17 Shortcut Layer: 13
   18 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   19 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   20 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   21 Shortcut Layer: 17
   22 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   23 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   24 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   25 Shortcut Layer: 21
   26 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   27 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   28 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   29 Shortcut Layer: 25
   30 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   31 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   32 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   33 Shortcut Layer: 29
   34 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   35 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   36 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   37 Shortcut Layer: 33
   38 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   39 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   40 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   41 Shortcut Layer: 37
   42 conv    128  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 128
   43 conv    128  3 x 3 / 1    32 x  32 x 128   ->    32 x  32 x 128
   44 conv    512  1 x 1 / 1    32 x  32 x 128   ->    32 x  32 x 512
   45 Shortcut Layer: 41
   46 conv    256  1 x 1 / 1    32 x  32 x 512   ->    32 x  32 x 256
   47 conv    256  3 x 3 / 2    32 x  32 x 256   ->    16 x  16 x 256
   48 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   49 Shortcut Layer: 45
   50 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   51 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   52 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   53 Shortcut Layer: 49
   54 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   55 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   56 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   57 Shortcut Layer: 53
   58 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   59 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   60 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   61 Shortcut Layer: 57
   62 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   63 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   64 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   65 Shortcut Layer: 61
   66 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   67 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   68 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   69 Shortcut Layer: 65
   70 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   71 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   72 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   73 Shortcut Layer: 69
   74 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   75 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   76 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   77 Shortcut Layer: 73
   78 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   79 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   80 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   81 Shortcut Layer: 77
   82 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   83 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   84 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   85 Shortcut Layer: 81
   86 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   87 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   88 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   89 Shortcut Layer: 85
   90 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   91 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   92 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   93 Shortcut Layer: 89
   94 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   95 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
   96 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
   97 Shortcut Layer: 93
   98 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
   99 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  100 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  101 Shortcut Layer: 97
  102 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  103 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  104 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  105 Shortcut Layer: 101
  106 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  107 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  108 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  109 Shortcut Layer: 105
  110 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  111 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  112 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  113 Shortcut Layer: 109
  114 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  115 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  116 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  117 Shortcut Layer: 113
  118 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  119 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  120 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  121 Shortcut Layer: 117
  122 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  123 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  124 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  125 Shortcut Layer: 121
  126 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  127 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  128 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  129 Shortcut Layer: 125
  130 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  131 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  132 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  133 Shortcut Layer: 129
  134 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  135 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  136 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  137 Shortcut Layer: 133
  138 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  139 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  140 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  141 Shortcut Layer: 137
  142 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  143 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  144 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  145 Shortcut Layer: 141
  146 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  147 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  148 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  149 Shortcut Layer: 145
  150 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  151 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  152 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  153 Shortcut Layer: 149
  154 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  155 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  156 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  157 Shortcut Layer: 153
  158 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  159 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  160 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  161 Shortcut Layer: 157
  162 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  163 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  164 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  165 Shortcut Layer: 161
  166 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  167 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  168 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  169 Shortcut Layer: 165
  170 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  171 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  172 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  173 Shortcut Layer: 169
  174 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256
  175 conv    256  3 x 3 / 1    16 x  16 x 256   ->    16 x  16 x 256
  176 conv   1024  1 x 1 / 1    16 x  16 x 256   ->    16 x  16 x1024
  177 Shortcut Layer: 173
  178 conv    256  1 x 1 / 1    16 x  16 x1024   ->    16 x  16 x 256

Please confirm whether this observation is right, and whether I should proceed with the training.

Kind Regards Arun

AlexeyAB commented 6 years ago

@arun-kumark

Something went wrong!

Check the content of your cfg/resnet152.cfg file: is it really resnet152?
Check that you are doing everything right.
If you still get the wrong result, try using this fork for the partial step: https://github.com/AlexeyAB/darknet

It should look like this:

arun-kumark commented 6 years ago

Hi Alexey,

After switching to the fork, the issue is resolved. It seems there were inconsistent files in my repository. Now the right layers are displayed when the file is generated. My GPU is busy training.

  192 conv   2048  1 x 1 / 1     8 x   8 x 512   ->     8 x   8 x2048
  193 Shortcut Layer: 189
  194 conv    512  1 x 1 / 1     8 x   8 x2048   ->     8 x   8 x 512
  195 conv    512  3 x 3 / 1     8 x   8 x 512   ->     8 x   8 x 512
  196 conv   2048  1 x 1 / 1     8 x   8 x 512   ->     8 x   8 x2048
  197 Shortcut Layer: 193
  198 conv    512  1 x 1 / 1     8 x   8 x2048   ->     8 x   8 x 512
  199 conv    512  3 x 3 / 1     8 x   8 x 512   ->     8 x   8 x 512
  200 conv   2048  1 x 1 / 1     8 x   8 x 512   ->     8 x   8 x2048
  201 Shortcut Layer: 197
  202 conv     70  1 x 1 / 1     8 x   8 x2048   ->     8 x   8 x  70
  203 detection
mask_scale: Using default '1.000000'
Loading weights from resnet152.200...Done!

Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Loaded: 0.461994 seconds
Region Avg IOU: 0.001865, Class: 0.148485, Obj: 0.295209, No Obj: 0.544053, Avg Recall: 0.000000,  count: 41
Region Avg IOU: 0.001082, Class: 0.089779, Obj: 0.531414, No Obj: 0.558429, Avg Recall: 0.000000,  count: 32
Region Avg IOU: 0.001236, Class: 0.162143, Obj: 0.469012, No Obj: 0.541242, Avg Recall: 0.000000,  count: 43
Region Avg IOU: 0.003289, Class: 0.138179, Obj: 0.354948, No Obj: 0.538169, Avg Recall: 0.000000,  count: 38
Region Avg IOU: 0.000340, Class: 0.109414, Obj: 0.237255, No Obj: 0.549358, Avg Recall: 0.000000,  count: 40
Region Avg IOU: 0.001976, Class: 0.126576, Obj: 0.408487, No Obj: 0.540027, Avg Recall: 0.000000,  count: 54
Region Avg IOU: 0.007314, Class: 0.120645, Obj: 0.289803, No Obj: 0.531240, Avg Recall: 0.000000,  count: 37
Region Avg IOU: 0.004117, Class: 0.098440, Obj: 0.432943, No Obj: 0.540761, Avg Recall: 0.000000,  count: 51
1: 2457.756592, 2457.756592 avg, 0.000000 rate, 2.658998 seconds, 128 images
Loaded: 0.000022 seconds
Region Avg IOU: 0.011552, Class: 0.083911, Obj: 0.404444, No Obj: 0.538594, Avg Recall: 0.000000,  count: 59
Region Avg IOU: 0.005468, Class: 0.159013, Obj: 0.400990, No Obj: 0.531313, Avg Recall: 0.000000,  count: 82
Region Avg IOU: 0.010414, Class: 0.187706, Obj: 0.385098, No Obj: 0.532727, Avg Recall: 0.000000,  count: 51
Region Avg IOU: 0.009937, Class: 0.060624, Obj: 0.364081, No Obj: 0.553725, Avg Recall: 0.000000,  count: 33
Region Avg IOU: 0.002738, Class: 0.094282, Obj: 0.680848, No Obj: 0.552582, Avg Recall: 0.000000,  count: 35
Region Avg IOU: 0.006391, Class: 0.166192, Obj: 0.327868, No Obj: 0.525272, Avg Recall: 0.000000,  count: 42
Region Avg IOU: 0.005366, Class: 0.125848, Obj: 0.451338, No Obj: 0.543940, Avg Recall: 0.000000,  count: 50

I will post the detection results once the training is completed.

Thank you.

Kind Regards Arun

arun-kumark commented 6 years ago

Hi, I tested my model yesterday evening. The prediction was good, but there is a problem with the detection. When I increase max_crop to 640 or more, the camera detects nicely from a distance. When I reduce this parameter in the cfg file, there are very few detections from a distance, while bringing the object nearer to the camera gives many detections (the right class, but many bounding boxes). The average loss after 10000 batches was 4.700.

Then I changed the CFG file and again started the training on two GPUs, with the following configuration:

[net]
batch=128 #earlier it was 64
subdivisions=32 #earlier it was 8

height=448 #earlier it was 225
width=448 #earlier it was 225
max_crop=448
channels=3
momentum=0.9
decay=0.0005

burn_in=1000
learning_rate=0.0001
policy=poly
power=4
max_batches=14000

Now, after 11K batches, the average loss still does not go below 4.7 and the IOU remains around 0.7. Can it reach 0.9? If yes, what modification to the dataset is needed? Please guide me. Below are the logs after 11K iterations:

Region Avg IOU: 0.660135, Class: 0.981329, Obj: 0.487581, No Obj: 0.010456, Avg Recall: 0.894737,  count: 19
Region Avg IOU: 0.763370, Class: 0.995517, Obj: 0.739083, No Obj: 0.006776, Avg Recall: 1.000000,  count: 5
Region Avg IOU: 0.742606, Class: 0.999824, Obj: 0.713308, No Obj: 0.005774, Avg Recall: 1.000000,  count: 2
Region Avg IOU: 0.612590, Class: 0.874384, Obj: 0.590807, No Obj: 0.013855, Avg Recall: 0.800000,  count: 20
Region Avg IOU: 0.662515, Class: 0.974764, Obj: 0.572539, No Obj: 0.007961, Avg Recall: 0.833333,  count: 12
Region Avg IOU: 0.807390, Class: 0.999769, Obj: 0.728954, No Obj: 0.006462, Avg Recall: 1.000000,  count: 6
Region Avg IOU: 0.761094, Class: 0.993665, Obj: 0.711664, No Obj: 0.018441, Avg Recall: 0.923077,  count: 13
Region Avg IOU: 0.610503, Class: 0.855781, Obj: 0.623467, No Obj: 0.006637, Avg Recall: 0.666667,  count: 9
Region Avg IOU: 0.741221, Class: 0.991534, Obj: 0.656385, No Obj: 0.015778, Avg Recall: 1.000000,  count: 20
Region Avg IOU: 0.646886, Class: 0.921648, Obj: 0.606587, No Obj: 0.008618, Avg Recall: 0.692308,  count: 13
Region Avg IOU: 0.546909, Class: 0.773235, Obj: 0.517832, No Obj: 0.009554, Avg Recall: 0.647059,  count: 17
Region Avg IOU: 0.753439, Class: 0.984664, Obj: 0.596383, No Obj: 0.012855, Avg Recall: 1.000000,  count: 15
Region Avg IOU: 0.740693, Class: 0.892905, Obj: 0.633424, No Obj: 0.012075, Avg Recall: 1.000000,  count: 11
Region Avg IOU: 0.417592, Class: 0.568096, Obj: 0.360936, No Obj: 0.003444, Avg Recall: 0.466667,  count: 15
Region Avg IOU: 0.757063, Class: 0.989477, Obj: 0.742855, No Obj: 0.005838, Avg Recall: 1.000000,  count: 7
Region Avg IOU: 0.490070, Class: 0.727844, Obj: 0.467702, No Obj: 0.002171, Avg Recall: 0.400000,  count: 5
Region Avg IOU: 0.749233, Class: 0.993034, Obj: 0.557237, No Obj: 0.006512, Avg Recall: 1.000000,  count: 9
Region Avg IOU: 0.798812, Class: 0.999825, Obj: 0.740936, No Obj: 0.003703, Avg Recall: 1.000000,  count: 1
Region Avg IOU: 0.702355, Class: 0.963193, Obj: 0.619650, No Obj: 0.012679, Avg Recall: 0.882353,  count: 17
Region Avg IOU: 0.694633, Class: 0.921360, Obj: 0.614172, No Obj: 0.013266, Avg Recall: 0.850000,  count: 20
Region Avg IOU: 0.682420, Class: 0.880997, Obj: 0.618540, No Obj: 0.013046, Avg Recall: 0.933333,  count: 15
Region Avg IOU: 0.759445, Class: 0.997710, Obj: 0.749968, No Obj: 0.009021, Avg Recall: 1.000000,  count: 10
Region Avg IOU: 0.589034, Class: 0.641790, Obj: 0.381082, No Obj: 0.006324, Avg Recall: 0.800000,  count: 5
Region Avg IOU: 0.565351, Class: 0.831913, Obj: 0.548575, No Obj: 0.008631, Avg Recall: 0.588235,  count: 17
Region Avg IOU: 0.745497, Class: 0.999130, Obj: 0.703703, No Obj: 0.006737, Avg Recall: 0.857143,  count: 7
Region Avg IOU: 0.758075, Class: 0.992727, Obj: 0.601360, No Obj: 0.019998, Avg Recall: 1.000000,  count: 15
Region Avg IOU: 0.766810, Class: 0.997058, Obj: 0.723047, No Obj: 0.013459, Avg Recall: 0.928571,  count: 14
Region Avg IOU: 0.749548, Class: 0.968871, Obj: 0.657072, No Obj: 0.009012, Avg Recall: 0.933333,  count: 15
Region Avg IOU: 0.644974, Class: 0.899949, Obj: 0.606202, No Obj: 0.011949, Avg Recall: 0.833333,  count: 12
Region Avg IOU: 0.653251, Class: 0.999973, Obj: 0.650427, No Obj: 0.002155, Avg Recall: 1.000000,  count: 1
Region Avg IOU: 0.528771, Class: 0.746722, Obj: 0.499436, No Obj: 0.010765, Avg Recall: 0.550000,  count: 20
Region Avg IOU: 0.529873, Class: 0.702395, Obj: 0.408791, No Obj: 0.007409, Avg Recall: 0.461538,  count: 13
Region Avg IOU: 0.445353, Class: 0.633009, Obj: 0.489467, No Obj: 0.010415, Avg Recall: 0.466667,  count: 15
Region Avg IOU: 0.625656, Class: 0.865984, Obj: 0.643256, No Obj: 0.022857, Avg Recall: 0.800000,  count: 25
Region Avg IOU: 0.657238, Class: 0.910786, Obj: 0.601219, No Obj: 0.011132, Avg Recall: 0.857143,  count: 14
Region Avg IOU: 0.356616, Class: 0.480169, Obj: 0.371067, No Obj: 0.006882, Avg Recall: 0.333333,  count: 9
Region Avg IOU: 0.799334, Class: 0.996106, Obj: 0.736987, No Obj: 0.009270, Avg Recall: 1.000000,  count: 5
Region Avg IOU: 0.495373, Class: 0.738899, Obj: 0.510692, No Obj: 0.008909, Avg Recall: 0.571429,  count: 7
Region Avg IOU: 0.727941, Class: 0.999592, Obj: 0.653946, No Obj: 0.006865, Avg Recall: 1.000000,  count: 6
Region Avg IOU: 0.640597, Class: 0.888973, Obj: 0.571678, No Obj: 0.009311, Avg Recall: 0.722222,  count: 18
Region Avg IOU: 0.663166, Class: 0.860342, Obj: 0.642832, No Obj: 0.006604, Avg Recall: 0.857143,  count: 7
Region Avg IOU: 0.534534, Class: 0.749022, Obj: 0.552152, No Obj: 0.007498, Avg Recall: 0.687500,  count: 16
Region Avg IOU: 0.668086, Class: 0.965958, Obj: 0.653159, No Obj: 0.009047, Avg Recall: 0.900000,  count: 10
Region Avg IOU: 0.743687, Class: 0.999275, Obj: 0.675397, No Obj: 0.002428, Avg Recall: 1.000000,  count: 5
Region Avg IOU: 0.722658, Class: 0.988107, Obj: 0.624207, No Obj: 0.014219, Avg Recall: 0.941176,  count: 17
Region Avg IOU: 0.539097, Class: 0.765993, Obj: 0.492647, No Obj: 0.006569, Avg Recall: 0.538462,  count: 13
Region Avg IOU: 0.605611, Class: 0.799359, Obj: 0.525507, No Obj: 0.007891, Avg Recall: 0.666667,  count: 12
Region Avg IOU: 0.712711, Class: 0.965310, Obj: 0.629983, No Obj: 0.018342, Avg Recall: 0.812500,  count: 16
Region Avg IOU: 0.827956, Class: 0.999183, Obj: 0.492812, No Obj: 0.009154, Avg Recall: 1.000000,  count: 5
Region Avg IOU: 0.696157, Class: 0.979458, Obj: 0.706899, No Obj: 0.017219, Avg Recall: 0.863636,  count: 22
Region Avg IOU: 0.470520, Class: 0.716070, Obj: 0.455911, No Obj: 0.002836, Avg Recall: 0.500000,  count: 8
Region Avg IOU: 0.771328, Class: 0.999320, Obj: 0.742848, No Obj: 0.004383, Avg Recall: 1.000000,  count: 1
Region Avg IOU: 0.668899, Class: 0.881764, Obj: 0.599017, No Obj: 0.013235, Avg Recall: 0.800000,  count: 10
Region Avg IOU: 0.764554, Class: 0.993098, Obj: 0.630215, No Obj: 0.015358, Avg Recall: 1.000000,  count: 15
Region Avg IOU: 0.687430, Class: 0.967540, Obj: 0.659137, No Obj: 0.016945, Avg Recall: 0.800000,  count: 20
Region Avg IOU: 0.769750, Class: 0.998442, Obj: 0.578108, No Obj: 0.008273, Avg Recall: 1.000000,  count: 4
Region Avg IOU: 0.702567, Class: 0.960225, Obj: 0.463868, No Obj: 0.004220, Avg Recall: 0.800000,  count: 5
Region Avg IOU: 0.580877, Class: 0.832473, Obj: 0.522247, No Obj: 0.009376, Avg Recall: 0.733333,  count: 15
Region Avg IOU: 0.664587, Class: 0.946535, Obj: 0.540527, No Obj: 0.008747, Avg Recall: 0.909091,  count: 11
Region Avg IOU: 0.790199, Class: 0.989789, Obj: 0.593592, No Obj: 0.017175, Avg Recall: 1.000000,  count: 11
Region Avg IOU: 0.693093, Class: 0.973041, Obj: 0.571220, No Obj: 0.004351, Avg Recall: 0.857143,  count: 7
Region Avg IOU: 0.752242, Class: 0.992657, Obj: 0.701140, No Obj: 0.012526, Avg Recall: 1.000000,  count: 9
Region Avg IOU: 0.588476, Class: 0.842634, Obj: 0.581750, No Obj: 0.006982, Avg Recall: 0.700000,  count: 10
Region Avg IOU: 0.768080, Class: 0.925007, Obj: 0.682537, No Obj: 0.006848, Avg Recall: 1.000000,  count: 8
Syncing... Done!
12072: 4.486686, 4.682826 avg, 0.000000 rate, 9.533536 seconds, 3090432 images

Thanks. Kind regards, Arun
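
For anyone who wants to track these numbers over time, here is a minimal Python sketch that parses the per-iteration summary line from the log above. The field names (iteration, loss, avg_loss, rate, seconds, images) are my own labels for the columns, and the pattern only assumes the line layout shown in this log; lines with a different layout simply return None.

```python
import re

# Matches a summary line such as:
# "12072: 4.486686, 4.682826 avg, 0.000000 rate, 9.533536 seconds, 3090432 images"
# The group names are illustrative labels, not names used by Darknet itself.
SUMMARY_RE = re.compile(
    r"^(?P<iteration>\d+): (?P<loss>[\d.]+), (?P<avg_loss>[\d.]+) avg, "
    r"(?P<rate>[\d.]+) rate, (?P<seconds>[\d.]+) seconds, (?P<images>\d+) images$"
)


def parse_summary(line):
    """Return the fields of a summary line as a dict, or None if it does not match."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "iteration": int(match["iteration"]),
        "loss": float(match["loss"]),
        "avg_loss": float(match["avg_loss"]),
        "rate": float(match["rate"]),
        "seconds": float(match["seconds"]),
        "images": int(match["images"]),
    }


if __name__ == "__main__":
    line = "12072: 4.486686, 4.682826 avg, 0.000000 rate, 9.533536 seconds, 3090432 images"
    summary = parse_summary(line)
    print(summary)
    # Images processed per iteration, derived only from the two logged counters.
    print(summary["images"] // summary["iteration"])
```

For the line above, the images counter divided by the iteration counter is exactly 256. The rate column is printed with six decimals, so a very small learning rate would show up as 0.000000.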

nuannuan1991 commented 6 years ago

Dear @CBIR-LL, are you sure the SqueezeNet network has been implemented in Darknet? I checked the darknet/cfg directory: it has a ResNet cfg but no SqueezeNet cfg, so I don't know where to find it.

duynn912 commented 5 years ago

Dear @AlexeyAB, can you help me train YOLO with a ResNet-101 backbone? I tried to follow the instructions in your darknet repository, but there is nothing about training with ResNet-101. I also tried to extract partial ResNet-101 weights from the weights file downloaded from the Darknet site, but after training finished I got 0.0% AP for every class and 0.0% mAP.

arun-kumark commented 5 years ago

Hi Duynn912, please check the cfg file below, which I used some time ago for resnet152. Look at the last layers of the cfg; I hope they give you a clue about how to change yours.

[net]
# Testing
batch=1
subdivisions=1

height=448
width=448
max_crop=448
channels=3
momentum=0.9
decay=0.0005

burn_in=1000
learning_rate=0.0001
policy=poly
power=4
max_batches=14000

angle=7
hue=.1
saturation=.75
exposure=.75
aspect=.75

[convolutional]
batch_normalize=1
filters=64
size=7
stride=2
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

# Conv 4
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

# Conv 5
[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=2048
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky

[convolutional]
filters=70
size=1
stride=1
pad=1
activation=linear

[region]
anchors =  1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
bias_match=1
classes=9
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .001
random=0
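
A note on the last layers of the cfg above: for a [region] layer, the convolutional layer right before it is expected to have filters = num * (classes + coords + 1), which here is 5 * (9 + 4 + 1) = 70. The following Python sketch checks that relation on the tail of the cfg. The helper names (parse_cfg, check_head) and the small parser are illustrative assumptions, not part of Darknet.

```python
def parse_cfg(text):
    """Split darknet-style cfg text into a list of (section_name, options) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def expected_region_filters(region):
    """filters = num * (classes + coords + 1) for the conv layer feeding [region]."""
    num = int(region["num"])
    classes = int(region["classes"])
    coords = int(region["coords"])
    return num * (classes + coords + 1)


def check_head(sections):
    """Return (actual, expected) filters for the conv layer before the last [region]."""
    for index in range(len(sections) - 1, 0, -1):
        name, options = sections[index]
        if name == "region":
            prev_name, prev_options = sections[index - 1]
            if prev_name != "convolutional":
                raise ValueError("layer before [region] is not [convolutional]")
            return int(prev_options["filters"]), expected_region_filters(options)
    raise ValueError("no [region] section with a preceding layer found")


# Tail of the cfg posted above (only the keys needed for the check).
TAIL = """
[convolutional]
filters=70
size=1
stride=1
pad=1
activation=linear

[region]
anchors =  1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
classes=9
coords=4
num=5
"""

if __name__ == "__main__":
    actual, expected = check_head(parse_cfg(TAIL))
    print(actual, expected)
```

If you change classes or num for your own dataset, the filters value of that last convolutional layer has to change with them, otherwise the two numbers returned by the check will differ.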

duynn912 commented 5 years ago

Dear Arun Thank you for your cfg. It works, but the loss was still high and around 0.4. I tried to add a batch normalization after shortcut, the loss was lower and reached 0.1, Do you know any reason why the situation occur? Vào 15:16 Th 6, ngày 19 tháng 10 năm 2018 Arun notifications@github.com đã viết:

Hi Duynn912, Please check the cfg file, which I used sometimes back for resnet152, check the last layer of CFG, hope you will get the clue to change your cfg.

[net]

Testing

batch=1 subdivisions=1

height=448 width=448 max_crop=448 channels=3 momentum=0.9 decay=0.0005

burn_in=1000 learning_rate=0.0001 policy=poly power=4 max_batches=14000

angle=7 hue=.1 saturation=.75 exposure=.75 aspect=.75

[convolutional] batch_normalize=1 filters=64 size=7 stride=2 pad=1 activation=leaky

[maxpool] size=2 stride=2

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=64 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=128 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

Conv 4

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=256 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=1024 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

Conv 5

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=2 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=2048 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=2048 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=1 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=512 size=3 stride=1 pad=1 activation=leaky

[convolutional] batch_normalize=1 filters=2048 size=1 stride=1 pad=1 activation=linear

[shortcut] from=-4 activation=leaky

[convolutional] filters=70 size=1 stride=1 pad=1 activation=linear

[region] anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071 bias_match=1 classes=9 coords=4 num=5 softmax=1 jitter=.2 rescore=1

object_scale=5 noobject_scale=1 class_scale=1 coord_scale=1

absolute=1 thresh = .001 random=0
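For reference, the filters=70 on the last convolutional layer above is tied to the [region] settings: in a YOLOv2-style cfg, the layer feeding [region] needs num * (classes + coords + 1) filters. A minimal Python sketch of that check follows; the function name is only illustrative and is not part of darknet.

```python
def region_filters(num, classes, coords=4):
    """Filters needed by the convolutional layer that feeds a [region] layer.

    Each of the `num` anchors predicts `coords` box values, one objectness
    score and `classes` class scores.
    """
    return num * (classes + coords + 1)


# Values taken from the [region] section above: num=5, classes=9, coords=4
print(region_filters(num=5, classes=9, coords=4))  # 70
```

If classes or num changes in [region], the filters value of the preceding [convolutional] layer has to change with it.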

arun-kumark commented 5 years ago

I am not sure whether these links and this procedure still work; I am sharing them from my old notes (Apr-2018). I hope you created the partial file before starting the training...

PATH OF THE PARTIAL FILE

Issue: https://github.com/pjreddie/darknet/issues/323 https://pjreddie.com/media/files/resnet152.weights

CREATION OF THE PARTIAL FILE

./darknet partial cfg/cf_resnet152.cfg weights/resnet152.weights resnet152.200 200

TRAIN ON MULTIPLE GPUS

./darknet detector train cfg/cf_voc.data cfg/cf_resnet152-train.cfg resnet152.200 -gpus 0,1
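As a rough sketch of the two steps in these notes: `darknet partial` saves the first N layers of a weights file to a new file (200 here), and that file is then passed to `detector train` as the starting weights. The Python helper names below are hypothetical, the paths are the ones from the notes, and the commands are only assembled, not executed. It assumes the multi-GPU list is passed with `-gpus`.

```python
def partial_cmd(cfg, weights, out, n_layers, darknet="./darknet"):
    """Command that saves the first `n_layers` layers of `weights` to `out`."""
    return [darknet, "partial", cfg, weights, out, str(n_layers)]


def train_cmd(data, cfg, weights=None, gpus=None, darknet="./darknet"):
    """Command for `detector train`; leave `weights` out to start from random weights."""
    cmd = [darknet, "detector", "train", data, cfg]
    if weights:
        cmd.append(weights)
    if gpus:
        # Assumption: the GPU list is given as `-gpus 0,1`
        cmd += ["-gpus", ",".join(str(g) for g in gpus)]
    return cmd


print(" ".join(partial_cmd("cfg/cf_resnet152.cfg", "weights/resnet152.weights",
                           "resnet152.200", 200)))
print(" ".join(train_cmd("cfg/cf_voc.data", "cfg/cf_resnet152-train.cfg",
                         "resnet152.200", gpus=[0, 1])))
```

Building the argument list first makes it easy to pass to `subprocess.run` later, or to drop the weights argument when training a custom network from scratch.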

duynn912 commented 5 years ago

Hi Arun,

I followed the steps you mentioned, but I don't understand the result I got.

prateekgupta891 commented 5 years ago

Hi, I want to modify the YOLOv2 network architecture: keep a few layers, remove a few layers, and then add some new layers. Is it possible to do so by just making a few changes in the .cfg and running the command .\darknet detector train %datafile.data% %cfgfile.cfg%?

mtlcrusher commented 4 years ago

Hello prateekgupta891, yes, you can. It is mentioned here:

For training Yolo based on other models (DenseNet201-Yolo or ResNet50-Yolo), you can download and get pre-trained weights as shown in this file: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/partial.cmd If you made your own custom model that isn't based on other models, then you can train it without pre-trained weights; random initial weights will be used.

zunairaR commented 4 years ago

@mtlcrusher I tried using YOLOv3 with a ResNet50 backbone, following the steps in the link you shared, but I'm getting the error shown in the attached screenshot (error-yolo). They say it is because of the weighted shortcut layer update. Can you help me resolve it?

mtlcrusher commented 4 years ago

I think it's because the dimensions don't match. Rechecking the model may solve it.
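One way to recheck the model is to trace the output shape of each layer and compare the two inputs of every [shortcut]. The sketch below is a simplified stand-in, not darknet's own parser: it only handles [convolutional] and [shortcut], assumes pad=1 means padding of size // 2, and assumes a shortcut keeps the shape of the layer just before it. The layer list is a hypothetical excerpt in the style of a ResNet block.

```python
def trace_shapes(layers, width, height, channels):
    """Return the (w, h, c) output shape after every layer."""
    shapes = []
    w, h, c = width, height, channels
    for layer in layers:
        if layer["type"] == "convolutional":
            size, stride = layer["size"], layer["stride"]
            # Assumption: pad=1 means "pad by size // 2"
            padding = size // 2 if layer.get("pad", 0) else 0
            w = (w + 2 * padding - size) // stride + 1
            h = (h + 2 * padding - size) // stride + 1
            c = layer["filters"]
        # Assumption: a shortcut keeps the shape of the layer just before it
        shapes.append((w, h, c))
    return shapes


def shortcut_mismatches(layers, shapes):
    """List (index, from_shape, previous_shape) for shortcuts whose inputs differ."""
    problems = []
    for i, layer in enumerate(layers):
        if layer["type"] != "shortcut":
            continue
        offset = layer["from"]
        src = i + offset if offset < 0 else offset
        if shapes[src] != shapes[i - 1]:
            problems.append((i, shapes[src], shapes[i - 1]))
    return problems


def conv(filters, size, stride=1):
    """Build a hypothetical [convolutional] entry with pad=1."""
    return {"type": "convolutional", "filters": filters,
            "size": size, "stride": stride, "pad": 1}


# One preceding layer, then two blocks; the first block downsamples (stride=2).
layers = [
    conv(1024, 1),
    conv(512, 1), conv(512, 3, stride=2), conv(2048, 1),
    {"type": "shortcut", "from": -4},
    conv(512, 1), conv(512, 3), conv(2048, 1),
    {"type": "shortcut", "from": -4},
]
shapes = trace_shapes(layers, width=26, height=26, channels=1024)
for index, from_shape, prev_shape in shortcut_mismatches(layers, shapes):
    print(f"layer {index}: from={from_shape} vs previous={prev_shape}")
```

Whether such a difference is actually an error depends on the darknet build in use, so the output only points at the layers worth rechecking.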

hussienWehbi commented 3 years ago

@AurusHuang did you find out whether you can train from scratch, without any pre-trained weights?

mtlcrusher commented 3 years ago

It's been 3 years; I think this issue needs to be closed.

anandakevin commented 3 years ago

Yes, you can. Just run this command: darknet detector train %datafile.data% %cfgfile.cfg%

If you want to include mAP calculation, use the -map option, like this: darknet detector train %datafile.data% %cfgfile.cfg% -map

Hope this helps :)

htdung167 commented 2 years ago

Hi, I want to modify the YOLOv2 network architecture: use a few of its layers and add some new layers at the end of the network. Is it possible to use pretrained weights for the top layers of YOLOv2? It would be like fine-tuning in TensorFlow.

ayobamiakomolafe commented 2 years ago

I am working on research that requires changing the YOLOv4 backbone from CSPDarknet to EfficientNet, but I don't know how to adjust the cfg file to achieve that. I would appreciate it if anyone could provide some insight.

htdung167 commented 2 years ago

You can edit the cfg file using the layers documented in the wiki (https://github.com/AlexeyAB/darknet/wiki).

ayobamiakomolafe commented 2 years ago

Thanks, but which exact parameters do I need to adjust, and to what values?

htdung167 commented 2 years ago

You can change the values in one of the cfg files in the cfg folder, or you can write one from scratch based on the article.
