Train YOLO on the dataset using data augmentation

alvarag commented 7 years ago

I stopped the run because the NaN error persisted. Once we are sure that problem is fixed, we will run it again:

python3 train.py 
Parsing ./cfg/yolo.cfg
Parsing cfg/yolo.cfg
Loading yolo.weights ...
Successfully identified 203934260 bytes
Finished in 0.02673625946044922s
Model has a coco model name, loading coco labels.

Building net ...
Source | Train? | Layer description                | Output size
-------+--------+----------------------------------+---------------
       |        | input                            | (?, 608, 608, 3)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 608, 608, 32)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 304, 304, 32)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 304, 304, 64)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 152, 152, 64)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 152, 152, 128)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 76, 76, 128)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 76, 76, 256)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 76, 76, 128)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 76, 76, 256)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 38, 38, 256)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 38, 38, 512)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 38, 38, 256)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 38, 38, 512)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 38, 38, 256)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 38, 38, 512)
 Load  |  Yep!  | maxp 2x2p0_2                     | (?, 19, 19, 512)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 19, 19, 512)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 19, 19, 512)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | concat [16]                      | (?, 38, 38, 512)
 Load  |  Yep!  | conv 1x1p0_1  +bnorm  leaky      | (?, 38, 38, 64)
 Load  |  Yep!  | local flatten 2x2                | (?, 19, 19, 256)
 Load  |  Yep!  | concat [27, 24]                  | (?, 19, 19, 1280)
 Load  |  Yep!  | conv 3x3p1_1  +bnorm  leaky      | (?, 19, 19, 1024)
 Load  |  Yep!  | conv 1x1p0_1    linear           | (?, 19, 19, 425)
-------+--------+----------------------------------+---------------
Running entirely on CPU
cfg/yolo.cfg loss hyper-parameters:
    H       = 19
    W       = 19
    box     = 5
    classes = 80
    scales  = [1.0, 5.0, 1.0, 1.0]
Building cfg/yolo.cfg loss
Building cfg/yolo.cfg train op
2017-05-22 19:41:24.255888: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-22 19:41:24.255935: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-22 19:41:24.255943: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-22 19:41:24.255950: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-22 19:41:24.255956: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Finished in 7.35931396484375s

cfg/yolo.cfg parsing ./images/Default/
Parsing for ['Fitolito', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'] 
[====================>]100%  2017_5_17_18_3Image_489.json
Statistics:
Fitolito: 258
Dataset size: 188
Dataset of 188 instance(s)
Training statistics: 
    Learning rate : 1e-05
    Batch size    : 10
    Epoch number  : 50
    Backup every  : 1000
step 1 - loss 3.5565757751464844 - moving ave loss 3.5565757751464844
step 2 - loss 4.204679012298584 - moving ave loss 3.621386098861694
step 3 - loss 3.7044780254364014 - moving ave loss 3.629695291519165
step 4 - loss 3.893934965133667 - moving ave loss 3.656119258880615
step 5 - loss 4.612575054168701 - moving ave loss 3.751764838409424
step 6 - loss 4.304464817047119 - moving ave loss 3.807034836273194
step 7 - loss 4.00748348236084 - moving ave loss 3.8270797008819586
step 8 - loss 4.337255954742432 - moving ave loss 3.878097326268006
step 9 - loss nan - moving ave loss nan
step 10 - loss nan - moving ave loss nan
step 11 - loss nan - moving ave loss nan
step 12 - loss nan - moving ave loss nan
step 13 - loss nan - moving ave loss nan
step 14 - loss nan - moving ave loss nan
step 15 - loss nan - moving ave loss nan
step 16 - loss nan - moving ave loss nan
step 17 - loss nan - moving ave loss nan
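
The moving average values in the log above are consistent with an exponential moving average that keeps 0.9 of the previous value and adds 0.1 of the new loss. Once a NaN enters it, every later value is NaN too, which is why the run never recovers after step 9. The following is a minimal sketch of that bookkeeping with an early stop at the first NaN. The 0.9 factor is inferred from the logged numbers, and `track_loss` is a hypothetical helper, not part of darkflow.

```python
import math


def track_loss(losses, momentum=0.9):
    """Return (history, first_nan_step); steps are 1-indexed like the log.

    `momentum` is an assumption inferred from the logged values.
    """
    moving_avg = None
    history = []
    for step, loss in enumerate(losses, start=1):
        if math.isnan(loss):
            # Stop here: continuing would only propagate NaN.
            return history, step
        if moving_avg is None:
            moving_avg = loss  # the first step seeds the average
        else:
            moving_avg = momentum * moving_avg + (1.0 - momentum) * loss
        history.append((step, loss, moving_avg))
    return history, None


# First three losses copied from the log, followed by NaN values.
logged = [3.5565757751464844, 4.204679012298584, 3.7044780254364014,
          float("nan"), float("nan")]
history, first_nan = track_loss(logged)
print(first_nan)
```

Stopping at the first NaN avoids spending CPU time on steps that can no longer produce a usable checkpoint.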

jasag commented 7 years ago

OK. At least it looks like the error is not in the conversion of the coordinates to the YOLO format.
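
For reference, a coordinate conversion of this kind can be sanity-checked with a small function like the one below. It is only a sketch under assumptions: it takes pixel corner coordinates and produces the normalized center/width/height convention commonly used with YOLO, which may differ from what this project's parser actually does. `to_yolo_box` is a hypothetical name. Rejecting empty or out-of-range boxes is useful because such boxes are one plausible source of invalid loss values.

```python
def to_yolo_box(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corners to normalized (cx, cy, w, h).

    Assumes the normalized center/size convention; raises ValueError
    for empty boxes or boxes that fall outside the image.
    """
    if not (0 <= xmin < xmax <= img_w and 0 <= ymin < ymax <= img_h):
        raise ValueError("box is empty or outside the image")
    cx = (xmin + xmax) / 2.0 / img_w   # box center, as a fraction of width
    cy = (ymin + ymax) / 2.0 / img_h   # box center, as a fraction of height
    w = (xmax - xmin) / img_w          # box width, as a fraction of width
    h = (ymax - ymin) / img_h          # box height, as a fraction of height
    return cx, cy, w, h


# Illustrative values only: a 200x200 pixel box in a 608x608 image.
box = to_yolo_box(100, 200, 300, 400, 608, 608)
print(box)
```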

jasag commented 7 years ago

By the way, how many augmented images should I generate per real image? For now I will generate about 20 images per real image.

alvarag commented 7 years ago

Yes, you can start with about 20 per image, or even go up to 40 or 50. It is better to have many than few, even if you only distort them slightly, so that you avoid introducing too much noise.
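
As a rough illustration of "many copies, mild distortion", the sketch below draws a small random transformation for each augmented copy and shows the resulting dataset size. Everything in it is an assumption made for illustration: the function name `plan_augmentations`, the choice of transformations, and the distortion limits (5 degrees of rotation, 10% brightness shift). It only plans the parameters and does not touch image data.

```python
import random


def plan_augmentations(n_real, copies_per_image=20,
                       max_rotation_deg=5.0, max_brightness_shift=0.1,
                       seed=0):
    """Return one dict of mild, randomly drawn parameters per augmented copy."""
    rng = random.Random(seed)  # seeded so the plan is reproducible
    plan = []
    for image_index in range(n_real):
        for _ in range(copies_per_image):
            plan.append({
                "image_index": image_index,
                "rotation_deg": rng.uniform(-max_rotation_deg,
                                            max_rotation_deg),
                "brightness_shift": rng.uniform(-max_brightness_shift,
                                                max_brightness_shift),
                "flip_horizontal": rng.random() < 0.5,
            })
    return plan


# 188 is the dataset size reported in the training log; 20 copies each.
plan = plan_augmentations(n_real=188, copies_per_image=20)
print(len(plan))  # 188 * 20 = 3760
```

Raising `copies_per_image` to 40 or 50 grows the dataset without changing the distortion limits. Any geometric transformation applied to an image (rotation, flip) must also be applied to its bounding boxes.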

jasag commented 7 years ago

I have opened an issue in the darkflow repository to confirm how the weights obtained from training are meant to be used, since I was using them incorrectly.

jasag commented 7 years ago

Because training the model on all the images is difficult given the resources it requires, I am closing this issue, at least until we find a possible solution.

jasag / Phytoliths-recognition-system

Train YOLO on the dataset using data augmentation #63