Random silent crashes after Resizing

ShAlexanderGo commented 6 years ago

Hello. I am using the dataset from https://timebutt.github.io/static/how-to-train-yolov2-to-detect-custom-objects/ (about 300 images), a config file based on yolov3-tiny_xnor.cfg, and the no-GPU build of darknet. However, when I run training, darknet randomly exits after resizing. Even if training starts, another resizing occurs after 20 iterations and it can fail then too. What could be the problem?

PS C:\123\installation\darknet\build\darknet\x64> darknet detector train .\cfg\obj.data .\cfg\yolo-obj.cfg .\yolov3-tiny.conv.15
yolo-obj
layer     filters    size              input                output
   0 conv     16  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  16 0.150 BF
   1 max          2 x 2 / 2   416 x 416 x  16   ->   208 x 208 x  16 0.003 BF
   2 convX    32  3 x 3 / 1   208 x 208 x  16   ->   208 x 208 x  32 0.399 BF
   3 max          2 x 2 / 2   208 x 208 x  32   ->   104 x 104 x  32 0.001 BF
   4 convX    64  3 x 3 / 1   104 x 104 x  32   ->   104 x 104 x  64 0.399 BF
   5 max          2 x 2 / 2   104 x 104 x  64   ->    52 x  52 x  64 0.001 BF
   6 convX   128  3 x 3 / 1    52 x  52 x  64   ->    52 x  52 x 128 0.399 BF
   7 max          2 x 2 / 2    52 x  52 x 128   ->    26 x  26 x 128 0.000 BF
   8 convX   256  3 x 3 / 1    26 x  26 x 128   ->    26 x  26 x 256 0.399 BF
   9 max          2 x 2 / 2    26 x  26 x 256   ->    13 x  13 x 256 0.000 BF
  10 convX   512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512 0.399 BF
  11 max          2 x 2 / 1    13 x  13 x 512   ->    13 x  13 x 512 0.000 BF
  12 convX  1024  3 x 3 / 1    13 x  13 x 512   ->    13 x  13 x1024 1.595 BF
  13 convX   256  1 x 1 / 1    13 x  13 x1024   ->    13 x  13 x 256 0.089 BF
  14 conv    512  3 x 3 / 1    13 x  13 x 256   ->    13 x  13 x 512 0.399 BF
  15 conv     18  1 x 1 / 1    13 x  13 x 512   ->    13 x  13 x  18 0.003 BF
  16 yolo
  17 route  13
  18 convX   128  1 x 1 / 1    13 x  13 x 256   ->    13 x  13 x 128 0.011 BF
  19 upsample            2x    13 x  13 x 128   ->    26 x  26 x 128
  20 route  19 8
  21 convX   256  3 x 3 / 1    26 x  26 x 384   ->    26 x  26 x 256 1.196 BF
  22 conv     18  1 x 1 / 1    26 x  26 x 256   ->    26 x  26 x  18 0.006 BF
  23 yolo
Total BFLOPS 5.448
Loading weights from .\yolov3-tiny.conv.15...
 seen 32
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
 If error occurs - run training with flag: -dont_show
Resizing
576 x 576
Loaded: 0.000000 seconds
 Used FMA & AVX2
 Used AVX
PS C:\123\installation\darknet\build\darknet\x64>

As I said, sometimes it works fine:

 Used FMA & AVX2
 Used AVX
Region 16 Avg IOU: 0.475140, Class: 0.472502, Obj: 0.581972, No Obj: 0.494930, .5R: 0.400000, .75R: 0.200000,  count: 5
Region 23 Avg IOU: 0.398034, Class: 0.475274, Obj: 0.402009, No Obj: 0.513450, .5R: 0.200000, .75R: 0.200000,  count: 5

ghost commented 5 years ago

Hello, is there any suggestion or solution?

AlexeyAB commented 5 years ago

@Lavistas I would recommend training on a GPU.

ghost commented 5 years ago

Thank you very much. I followed your suggestion from another issue and I increased subdivisions to 64. It runs smoothly now.
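
For anyone landing here, a minimal sketch of that change. The thread never shows the actual cfg, so the sample text below (including `batch=64` and the starting `subdivisions=8`) is an assumption, and `set_cfg_option` and `images_per_pass` are hypothetical helpers, not part of darknet. The idea is only that darknet processes `batch / subdivisions` images per forward pass, so raising `subdivisions` shrinks the working set; whether memory was really the cause of the silent exits is not confirmed in this thread.

```python
import re


def set_cfg_option(cfg_text, section, key, value):
    """Return cfg_text with `key` set to `value` in the first [section] block."""
    out = []
    in_section = False
    seen_section = False
    done = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Only the first block with a matching name is edited.
            in_section = stripped[1:-1] == section and not seen_section
            seen_section = seen_section or in_section
        elif in_section and not done and re.match(rf"{re.escape(key)}\s*=", stripped):
            line = f"{key}={value}"
            done = True
        out.append(line)
    if not done:
        raise KeyError(f"{key} not found in [{section}]")
    return "\n".join(out) + ("\n" if cfg_text.endswith("\n") else "")


def images_per_pass(batch, subdivisions):
    """Images handled in one forward/backward pass (batch split into subdivisions)."""
    return batch // subdivisions


# Assumed sample; the real file in this thread is .\cfg\yolo-obj.cfg
sample = """[net]
batch=64
subdivisions=8
width=416
height=416

[yolo]
random=1
"""

updated = set_cfg_option(sample, "net", "subdivisions", 64)
print(updated)
print("images per pass before:", images_per_pass(64, 8))
print("images per pass after: ", images_per_pass(64, 64))
```

To apply it to a real file, read the cfg into a string, pass it through `set_cfg_option`, and write the result back; editing the `subdivisions=` line by hand in the `[net]` section does the same thing. As far as I know, the "Resizing" step comes from `random=1` in the `[yolo]` sections, so setting it to `0` should turn multi-scale resizing off, but nobody in this thread tried that.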

AlexeyAB / darknet

Random silent crashes after Resizing #1617