AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

How to train custom objects based on yolo-resnet152? #432

Closed anguoyang closed 6 years ago

anguoyang commented 6 years ago

Hi @AlexeyAB, I have trained YOLOv2 with custom objects, but it seems not very accurate, possibly because my images for a single object are always different. For example, these two images belong to the same category; we took photos of the two sides of the snack: https://imgur.com/a/huw7x https://imgur.com/a/CiprO

And we have many similar objects whose images on the two sides are totally different; however, we need to detect them as one object.

I am currently using Yolo Mark, which is also developed by you. There is a training tutorial for YOLOv2, but I could not find any documentation on how to train with yolo-resnet152. I found this command: darknet.exe partial resnet152.cfg resnet152.weights resnet152.201 201. However, how can I input my custom images? Could I just modify it to this: darknet.exe partial data/img resnet152.cfg resnet152.weights resnet152.201 201

Thank you.

AlexeyAB commented 6 years ago

@anguoyang Hi, resnet152 for detection is here /build/darknet/x64/resnet152_yolo.cfg

  1. Download this file: https://pjreddie.com/media/files/resnet152.weights

  2. Do darknet.exe partial cfg/resnet152.cfg resnet152.weights resnet152.201 201

  3. Change these lines: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1-L23

To these lines: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/yolo-voc.2.0.cfg#L1-L18

  4. Change the number of classes: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1463 And the number of filters as usual, filters=(classes+5)*num_of_anchors: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1456

  5. Remove this line (to do transfer-learning instead of fine-tuning): https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1439

  6. Run training: darknet.exe detector train data/obj.data resnet152_yolo.cfg resnet152.201


If it leads to NaN, then you can try leaving this line in place to do fine-tuning: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1439
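As a worked example of the filters formula in the steps above (assuming a single custom class and the YOLOv2 default of 5 anchors; adjust both values to your own setup), the end of resnet152_yolo.cfg would look roughly like this:

```
# filters = (classes + 5) * num_of_anchors = (1 + 5) * 5 = 30
[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear

[region]
# anchors and other params unchanged
classes=1
num=5
```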

anguoyang commented 6 years ago

Thank you so much for your great help! I suppose densenet-yolo uses a similar procedure?

AlexeyAB commented 6 years ago

@anguoyang Yes,

  1. Just download: https://pjreddie.com/media/files/densenet201.weights

  2. Do darknet.exe partial cfg/densenet201.cfg densenet201.weights densenet201.300 300

...

anguoyang commented 6 years ago

Ok, thank you, it is running now. One more question: once training is finished, can I use the same interface for detection?

AlexeyAB commented 6 years ago

Yes, you can use the same command for detection:

darknet.exe detector test obj.data resnet152_yolo.cfg resnet152_yolo_2000.weights -thresh 0.1

lixiangchun commented 6 years ago

@AlexeyAB Thanks for your great work; I learned a lot from your code. I have a question about how to train resnet50-yolo. From your steps above, I can see that the first 2 steps should be:

  1. wget https://pjreddie.com/media/files/resnet50.weights

  2. darknet partial cfg/resnet50_yolo.cfg resnet50.weights resnet50.xx xx. I get stuck at this step: how do I choose the last parameter (i.e. xx) passed to the partial command?

AlexeyAB commented 6 years ago

@lixiangchun Hi,

You can see it here: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/build/darknet/x64/partial.cmd#L25

Why 65?

Just run resnet50.cfg as a classifier: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/build/darknet/x64/classifier_resnet50.cmd#L1

and see that the penultimate convolutional layer is number 64. So you should extract layers [0 - 64], i.e. 65 layers.

In this way, when training the detector, 65 layers will be loaded from the pre-trained resnet50.65 and the last convolutional layer will be initialized with random values: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/src/convolutional_layer.c#L246


lixiangchun commented 6 years ago

@AlexeyAB Thanks for your prompt reply, I've got it now.

lixiangchun commented 6 years ago

@AlexeyAB When I train resnet50_yolo with random=1 in the cfg file, the following error occurred:

Cannot resize this type of layer: File exists
darknet: ./src/utils.c:199: error: Assertion `0' failed.

If random=0, resnet50_yolo works.

AlexeyAB commented 6 years ago

@lixiangchun Yes, [shortcut] layer doesn't support resize yet. I will fix it.

anguoyang commented 6 years ago

Hi @AlexeyAB, I have tried testing yolo, densenet, and resnet, and I found they are all insensitive to color differences. That is, if two objects are similar in shape but different in color, it is difficult to distinguish them from each other. Could you please give me some advice on how to improve this? Thank you.

AlexeyAB commented 6 years ago

@anguoyang Probably due to data augmentation.

Set these params and train: https://github.com/AlexeyAB/darknet/blob/15c89e7a714e7e37c13618eace9325a06f0642fc/cfg/yolo-voc.2.0.cfg#L10-L12

saturation = 1.01
exposure = 1.5
hue=.01

Can you show examples of colors that the network can not distinguish?

anguoyang commented 6 years ago

Hi @AlexeyAB, thank you for your quick response. I have uploaded two images that are similar in shape but different in color: https://imgur.com/a/02Ngo https://imgur.com/a/dZN8g

anguoyang commented 6 years ago

I have modified the cfg file and re-trained the YOLOv2 network, but there seems to be no improvement. Maybe it is because I have only trained for 2000 iterations? Are there any other factors that lead to loss of color information? Thank you.

anguoyang commented 6 years ago

I have uploaded the whole image data directory: https://github.com/anguoyang/fpdw4win/raw/master/data.zip You can download it and train/test with yolo or other nets.

AlexeyAB commented 6 years ago

Oh, your colors are very close to each other. So you should train with:

saturation = 1.0
exposure = 1.0
hue=0.0

Thus, the colors will not change during training. And try training for more than 2000 iterations.

anguoyang commented 6 years ago

Hi @AlexeyAB, I have modified the cfg according to your advice, and it is better than the original one. However, it is even better (at least in my case) to set: saturation = 0.1 exposure = 0.1 hue=0.1

I am now setting it to the following and training: saturation = 0.0 exposure = 0.0 hue=0.0

Once it is finished, I will come back here with the result :)

lixiangchun commented 6 years ago

Hi @AlexeyAB, is it okay to do center-cropping before any data augmentation? If yes, how do I specify it in the cfg file? I haven't found this parameter yet.

AlexeyAB commented 6 years ago

@lixiangchun Yes, it is ok, you can set the jitter= param from 0 to 1: https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/cfg/yolo-voc.2.0.cfg#L234

It's done here in the code: https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/data.c#L689-L704
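For reference, the jitter parameter lives in the [region] section; a sketch (jitter=.3 is the yolo-voc.2.0.cfg default, and the value is roughly the fraction of the image size that may be randomly cropped/translated during training):

```
[region]
# random crop/translate by up to 30% of the image size during training
jitter=.3
```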

AlexeyAB commented 6 years ago

@anguoyang Do these params saturation = 0.1 exposure = 0.1 hue=0.1 give you better results than saturation = 1.0 exposure = 1.0 hue=0.0?

This is strange, because exactly those params do not change colors at all during data augmentation, so the network should distinguish colors better: https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/image.c#L1246-L1267


Functions: rand_scale() and rand_uniform_strong()

https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/utils.c#L615-L620 https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/utils.c#L654-L662

anguoyang commented 6 years ago

Yes, it's true: it can distinguish the two images after setting them all to 0.1, with almost the same number of iterations. I don't know why.

anguoyang commented 6 years ago

You could try training and testing on my dataset; since I have already marked all the labels, it should not take much time. You could just train it and see the result.

AlexeyAB commented 6 years ago

@AlexeyAB When I train resnet50_yolo with random=1 in the cfg file, the following error occurred:

Cannot resize this type of layer: File exists
darknet: ./src/utils.c:199: error: Assertion `0' failed.

If random=0, resnet50_yolo works.

@lixiangchun I added resize_shortcut_layer(). So now you can use random=1 for the resnet50_yolo.cfg and resnet152_yolo.cfg.

lixiangchun commented 6 years ago

Hi @AlexeyAB, Thanks for your great work. I will try it soon.

lixiangchun commented 6 years ago

Hi @AlexeyAB, when training resnet50_yolo with random = 1, an error occurs:

darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.

AlexeyAB commented 6 years ago

@lixiangchun Hi, try to do

make clean
make -j8

I trained this model for about 300 iterations, and it did not lead to an error: resnet50_yolo.zip

darknet.exe detector train data/voc_air.data resnet50_yolo.cfg resnet50.65

lixiangchun commented 6 years ago

@AlexeyAB I cloned your latest commit and I also encountered the same error using your resnet50_yolo.cfg, right at the beginning after loading the model. Could you provide a link to your resnet50.65?

AlexeyAB commented 6 years ago

@lixiangchun You can get the file resnet50.65 using this command: ./darknet partial cfg/resnet50.cfg resnet50.weights resnet50.65 65 First you should download: https://pjreddie.com/media/files/resnet50.weights


How to get pre-trained files for other models: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/partial.cmd

TaihuLight commented 6 years ago

@anguoyang @AlexeyAB @lixiangchun Could you release the stable Linux version of the code and trained weights for detection and classification, for training and testing resnet50, resnet101, resnet152, and densenet201? That way, you could save more time for coding new functions.

AlexeyAB commented 6 years ago

@TaihuLight

Could you release the stable Linux version of the code

What do you mean? The current commit is almost stable, and there is a stable release: https://github.com/AlexeyAB/darknet/releases


trained weights regarding detection and classification, for training and testing resnet50, resnet101, resnet152 and densenet201.

There are trained weights for classification: https://pjreddie.com/darknet/imagenet/#pretrained

Then, to train your own weights, you can use pre-trained weights that you can get by launching this file: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/partial.cmd

Also, these cfg-files aren't well tested as detectors and are slow:

So we are waiting for Yolo v3: https://github.com/pjreddie/darknet/tree/yolov3

anguoyang commented 6 years ago

Hi @AlexeyAB, based on my testing/experience, the source code is stable enough, thank you. What troubles me is the accuracy and also the maximum number of objects. I have also tested densenet; it is better than yolo on accuracy, but not good enough. For example: I have trained and tested with 5 objects, each object with about 8 images. The testing on these objects is not bad; however, if I add one more object for testing (which differs from the trained objects in shape and color), it sometimes also gets 40% confidence, which is very depressing.

TaihuLight commented 6 years ago

I have the same problem as @anguoyang. A stable version means that the code can be used to train weights and get good accuracy on a specific dataset, not only run successfully; thus, you can help more learners. Besides, I need resnet101.cfg for classification; where can I get it, or do I need to edit it myself?

AlexeyAB commented 6 years ago

@TaihuLight You need to cut off several layers from resnet152.cfg by yourself.

AlexeyAB commented 6 years ago

@anguoyang

I have trained and tested with 5 objects, each object with about 8 images

Do you mean that your training dataset contains only 8 images per class? That is very few.

What mAP and IoU can you get?

Also you can read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

anguoyang commented 6 years ago

Hi @AlexeyAB, but we could only take photos from 8 directions. Do you mean that we should take more photos in each direction? The background is the same.

anguoyang commented 6 years ago

Hi @AlexeyAB, I have also tried to modify the cfg file according to https://github.com/AlexeyAB/darknet#how-to-improve-object-detection, including random=1, adding negative samples, etc., but it didn't work. Maybe the only thing I need to do is add more images for each object.

AlexeyAB commented 6 years ago

Yes, more images. And this: it is desirable that your training dataset includes images with objects at different scales, rotations, lightings, and from different sides.

Your training dataset should contain images at all the scales, rotations, lightings, and sides that you want to detect.

anguoyang commented 6 years ago

@AlexeyAB, we actually need to detect objects at fixed scales, lightings, and rotations, and even the background is the same; so we took photos of both sides and from 4 directions, which is why we actually need only 8 images so far :)

sivagnanamn commented 6 years ago

@anguoyang Even though you need to detect objects at fixed scales & against the same background, you need more variety in your training data. If you train with just 8 images per class, your network will easily overfit to your training images & not learn the features of the objects present in the training data.

You can improve the performance of your NN by providing the same objects in different backgrounds, scales, lighting, etc. If you cannot afford to collect such data, you can try augmentation.
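As a concrete illustration, part of that variety can be generated by Darknet's own augmentation parameters in the [net] section (the values below are the yolo-voc.2.0.cfg defaults; as discussed earlier in this thread, keep hue/saturation low if your classes differ mainly by color):

```
[net]
# random HSV distortions applied to each training image
saturation = 1.5
exposure = 1.5
hue = .1
```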

anguoyang commented 6 years ago

@sivagnanamn , thank you, I will try more images.

TaihuLight commented 6 years ago

@lixiangchun @AlexeyAB I also encountered the same error using your resnet50_yolo.cfg provided in this issue, right at the beginning after loading the model, when OPENCV=0, even though I generated resnet50.65 with the correct command. Did you solve it?

But if random=0, resnet50_yolo works.

$ ./darknet partial cfg/resnet50.cfg resnet50.weights resnet50.65 65
layer filters size input output
0 conv 64 7 x 7 / 2 256 x 256 x 3 -> 128 x 128 x 64
...
65 Shortcut Layer: 61
66 conv 1000 1 x 1 / 1 8 x 8 x2048 -> 8 x 8 x1000
67 avg 8 x 8 x1000 -> 1000
68 softmax 1000
69 cost 1000
Loading weights from resnet50.weights... seen 64 Done!
Saving weights to resnet50.65

$ ./darknet detector train data/voc.data cfg/resnet50_yolo.cfg resnet50.65
layer filters size input output
0 conv 64 7 x 7 / 2 416 x 416 x 3 -> 208 x 208 x 64
...
65 Shortcut Layer: 61
66 conv 1024 1 x 1 / 1 13 x 13 x2048 -> 13 x 13 x1024
67 conv 125 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 125
68 detection
mask_scale: Using default '1.000000'
Loading weights from resnet50.65... seen 32 Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Resizing network 416
darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.
Aborted

AlexeyAB commented 6 years ago

@TaihuLight Yes, this is solved; try to train using the last version of build/darknet/x64/resnet50_yolo.cfg: https://raw.githubusercontent.com/AlexeyAB/darknet/2a9f7e44ce1b73d3d56ef83f83e94f074ecac3f9/build/darknet/x64/resnet50_yolo.cfg

I can train it successfully:


yanhn commented 6 years ago

Hello @AlexeyAB, I've read the posts above carefully and did as you suggested, but I still get the error darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.

Here's what I did:

  1. clone the latest code.
  2. modify the resnet50_yolo.cfg file; it looks like: https://github.com/yanhn/testFile/blob/master/resnet50_yolo.cfg
  3. ./darknet partial cfg/resnet50.cfg cfg/resnet50.weights self_cfg/resnet50.65 65
  4. ./darknet detector train self_cfg/video.data self_cfg/resnet50_yolo.cfg self_cfg/resnet50.65 -gpus 0 -dont_show

And I also trained with resnet152 weights trained with pjreddie's darknet; both give the same error. Any help would be appreciated, thanks.

AlexeyAB commented 6 years ago

@yanhn Try removing the old version of Darknet and downloading it again.

I just ran this command and I can train your https://github.com/yanhn/testFile/blob/master/resnet50_yolo.cfg file successfully: darknet.exe detector train self_cfg/video.data self_cfg/resnet50_yolo.cfg self_cfg/resnet50.65 -gpus 0 -dont_show


yanhn commented 6 years ago

Thank you. I managed to train the model on the Windows platform, which proves that my data is correct and the partial resnet50 model is correct. But I still get the error on Ubuntu.

I am using the latest code with commit id 0fe1c6bcc86edc649624d655643627e20d02eba9, and I changed the opencv version from 3.3 to 3.4 and 3.4.1, but still get the same error.

As for my Ubuntu environment, I managed to train yolov3 & tiny-yolo models using my own data, so I think the environment is ok. I printed some logs that may be helpful: I added printf("input: %d, output: %d\ninwidth: %d, inheight: %d, outwidth: %d, outheight: %d\n", l->inputs, l->outputs, l->w, l->h, l->out_w, l->out_h); at line 41 of shortcut_layer.c; the output is in the attached screenshot.

I worked around it by commenting out the assert statement. I can train the model for now, but I need to check it later.

AlexeyAB commented 6 years ago

@yanhn Did you completely remove Darknet from Ubuntu, and did you remove your old cfg-file from all locations on Ubuntu? Please check it twice.

yanhn commented 6 years ago

@AlexeyAB No, I just pulled the latest code. I'll try what you suggested. And by the way, does the old cfg-file mean resnet50_yolo.cfg?

AlexeyAB commented 6 years ago

@yanhn Yes, you should just comment out these lines with assert(). I fixed it: https://github.com/AlexeyAB/darknet/commit/16cfff811f8a5898899cdd0b7139d216466371d2

https://github.com/AlexeyAB/darknet/blob/16cfff811f8a5898899cdd0b7139d216466371d2/src/shortcut_layer.c#L39-L42


On Windows, asserts are disabled in Release mode, so I didn't see these errors. Now I have checked it on Linux, and the asserts should be removed.

yanhn commented 6 years ago

Ok, thank you.

anguoyang commented 6 years ago

@AlexeyAB @sivagnanamn Just as we discussed before, 8 images per item is too few, so we tried our best to take about 500 images for each item (currently we can only afford to collect images for 8 items). The problem is, when I followed the instructions for training YOLOv3 on custom objects, my 1080 Ti machine always gives a CUDA out-of-memory error. My question is: how do I calculate the memory requirement? Total image size x batch?
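A note on the memory question (not answered above; this is a sketch based on the repository README): GPU memory usage in Darknet depends mainly on the network resolution and the batch/subdivisions settings in the [net] section, not on the total number of training images. A common remedy for CUDA out-of-memory errors is to increase subdivisions or reduce the input resolution, e.g.:

```
[net]
batch=64
# larger subdivisions => the batch is processed in smaller chunks, using less GPU memory
subdivisions=32
# or reduce the input resolution
width=416
height=416
```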