Training - Githubissues

Hi Alex,

When I train the neural network, I get the following in the terminal:

(deteccionobj) C:\Users\Master\Documents\deteccion-objetos-video-master>python train.py --model_def config/yolov3-custom.cfg --data_config config/custom.data --pretrained_weights weights/darknet53.conv.74 --batch_size 2
Namespace(batch_size=2, checkpoint_interval=1, compute_map=False, data_config='config/custom.data', epochs=100, evaluation_interval=1, gradient_accumulations=2, img_size=416, model_def='config/yolov3-custom.cfg', multiscale_training=True, n_cpu=8, pretrained_weights='weights/darknet53.conv.74')
2020-11-24 13:04:50.809085: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
Traceback (most recent call last):
  File "train.py", line 99, in <module>
    for batch_i, (_, imgs, targets) in enumerate(dataloader):
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
    data = self._next_data()
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
    data.reraise()
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\utils\data\_utils\worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Master\miniconda3\envs\deteccionobj\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "C:\Users\Master\Documents\deteccion-objetos-video-master\utils\datasets.py", line 141, in collate_fn
    targets = torch.cat(targets, 0)
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at aten\src\ATen\CPUType.cpp:2127 [kernel]
QuantizedCPU: registered at aten\src\ATen\QuantizedCPUType.cpp:297 [kernel]
BackendSelect: fallthrough registered at ..\aten\src\ATen\core\BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at ..\aten\src\ATen\core\NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradCPU: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradCUDA: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradXLA: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse1: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse2: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse3: registered at ..\torch\csrc\autograd\generated\VariableType_2.cpp:8078 [autograd kernel]
Tracer: registered at ..\torch\csrc\autograd\generated\TraceType_2.cpp:9654 [kernel]
Autocast: registered at ..\aten\src\ATen\autocast_mode.cpp:258 [kernel]
Batched: registered at ..\aten\src\ATen\BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at ..\aten\src\ATen\VmapModeRegistrations.cpp:33 [backend fallback]

And no checkpoint file or anything else gets generated. Do you know what could be causing this?

Thanks a lot.

Hi! I'm getting the same error Jorge describes. In my case it runs the first batch but then stops training. I haven't been able to find a solution...

@puigalex

This is my error:

`/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Namespace(batch_size=2, checkpoint_interval=1, compute_map=False, data_config='config/custom.data', epochs=2, evaluation_interval=1, gradient_accumulations=1, img_size=416, model_def='config/yolov3-custom.cfg', multiscale_training=True, n_cpu=1, pretrained_weights='weights/darknet53.conv.74')
---- [Epoch 0/2, Batch 0/9] ----
+------------+--------------+--------------+--------------+
| Metrics    | YOLO Layer 0 | YOLO Layer 1 | YOLO Layer 2 |
+------------+--------------+--------------+--------------+
| grid_size  | 14           | 28           | 56           |
| loss       | 74.792717    | 81.729988    | 84.118752    |
| x          | 0.114787     | 0.176580     | 0.162244     |
| y          | 0.320508     | 0.251776     | 0.156272     |
| w          | 0.128963     | 2.934671     | 6.220413     |
| h          | 0.033253     | 1.888722     | 7.659881     |
| conf       | 73.707291    | 75.593887    | 69.238266    |
| cls        | 0.487916     | 0.884351     | 0.681678     |
| cls_acc    | 100.00%      | 100.00%      | 100.00%      |
| recall50   | 0.500000     | 0.000000     | 0.000000     |
| recall75   | 0.500000     | 0.000000     | 0.000000     |
| precision  | 0.001534     | 0.000000     | 0.000000     |
| conf_obj   | 0.478465     | 0.506171     | 0.449425     |
| conf_noobj | 0.506021     | 0.522323     | 0.492824     |
+------------+--------------+--------------+--------------+
Total loss 240.64144897460938
---- ETA 0:01:44.239290
Traceback (most recent call last):
  File "train.py", line 99, in <module>
    for batch_i, (_, imgs, targets) in enumerate(dataloader):
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nando/anaconda3/envs/deteccionobj/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nando/YOLOv3/deteccion-objetos-video/utils/datasets.py", line 130, in __getitem__
    img, targets = horisontal_flip(img, targets)
  File "/home/nando/YOLOv3/deteccion-objetos-video/utils/augmentations.py", line 8, in horisontal_flip
    targets[:, 2] = 1 - targets[:, 2]
TypeError: 'NoneType' object is not subscriptable `

Thanks!

I have the same error. Were you able to solve it? Thanks

Hi Alberto, no, in the end I didn't find the solution.

I have the same error.

I ran into that error too. In my case it was because some images had no .txt file, or because the .txt file had a different name from the image it belongs to.
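In case it helps anyone check their dataset before training, here is a minimal sketch of that check. It assumes each image has a label file with the same base name and a `.txt` extension in a separate labels folder. The function name `find_label_problems` and the list of image extensions are my own choices for illustration and are not part of the repo. It also flags empty label files, on the assumption that those are usually unintended.

```python
from pathlib import Path

# Assumption: these are the image extensions used in the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_label_problems(image_dir, label_dir):
    """Return (missing, empty): names of images with no label file,
    and names of images whose label file exists but has no content."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    missing, empty = [], []
    for image_path in sorted(image_dir.iterdir()):
        # Skip anything that is not an image (hidden files, notes, etc.).
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # The label must share the image's base name: foo.jpg -> foo.txt
        label_path = label_dir / (image_path.stem + ".txt")
        if not label_path.is_file():
            missing.append(image_path.name)
        elif not label_path.read_text().strip():
            empty.append(image_path.name)
    return missing, empty
```

You would call it with your own folders, for example `find_label_problems("data/custom/images", "data/custom/labels")` (those paths are an assumption about the layout; adjust them to wherever your images and labels live), and then fix or remove every image it reports before running `train.py` again.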

puigalex / deteccion-objetos-video

Training #16