Closed ikerodl96 closed 5 years ago
Which kind of dataset are you using — is it CSV or XML? Does this error occur during validation? Is your dataset split between training and validation?
Hello @rodrigo2019, thank you so much for your fast reply.
The annotations of my dataset are specified in XML (VOC format), my dataset is split into train and validation (80%-20%; I only provide both the images and annotations for the training set), and the error happens during training. I attach an example:
Epoch 1/150
1/5 [=====>........................] - ETA: 25s - loss: 242.3871
2/5 [===========>..................] - ETA: 15s - loss: 240.8282
3/5 [=================>............] - ETA: 9s - loss: 231.9124
4/5 [=======================>......] - ETA: 4s - loss: 228.4962
Traceback (most recent call last):
File "C:/Users/iotxoa/Desktop/keras_yolov2_proyecto/train.py", line 127, in
Surfing the web, I have found that it may be related to the data generator. Some people say that there should be an infinite loop (`while True:`, `while 1:`) inside the corresponding function in the preprocessing.py file. For example: stackoverflow.
I would appreciate any kind of help, because I would like to use this YOLO implementation for a university project and I am in a hurry.
Many thanks in advance.
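For reference, here is a minimal sketch of the pattern those Stack Overflow answers describe (toy names, not this repo's actual generator; judging by the traceback further down, this project uses a `keras.utils.Sequence`, whose `__getitem__` does not need such a loop):

```python
def batch_generator(samples, batch_size):
    """Toy fit_generator-style data generator: the outer `while True:`
    makes it yield batches forever instead of being exhausted after a
    single pass over the data."""
    while True:  # without this loop, a second epoch raises StopIteration
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

gen = batch_generator(list(range(10)), batch_size=4)
```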
Looks like you have some bad samples.
Put `print(self._images[i]['filename'])` before this line.
After that, check the annotation corresponding to that file; probably some annotation is missing the `xmin` value.
Hello @rodrigo2019
I have just tried what you have suggested, but no filename is printed in the console. The usual traceback appears... but nothing more.
Anyway, I have manually revised all the XML annotation files for each of the images and all of them seem to be correct.
Many thanks for the fast answers and checks that you are giving to me. I really appreciate your help.
I have just tried what you have suggested, but no filename is printed in the console. The usual traceback appears... but nothing more.
Try writing it to a file. Use try/except; in the except branch you can write the filename to the file.
Hello again @rodrigo2019,
I have done what you mentioned, but no file is generated inside the except block. This is very strange and time consuming...
Hi @rodrigo2019,
Finally, I found the error. Among the images and annotations there was one which accidentally had a label in polygon format, which is wrong since all of them should be rectangles (bounding boxes), as I specified when creating the labeling template in Labelbox.
I have manually fixed that and now the training process seems to work. We will see the performance... I hope it is not too bad...
Thank you @rodrigo2019 for all your rapid replies and for sharing this nice work.
Best regards
Hello @rodrigo2019,
First of all, thank you for sharing this nice work. This is the most feasible, nice and complete YOLO v2 implementation in Keras that I have found. In my case, I am having a problem when training with my own small dataset (fewer than 30 samples). I know that there is nothing to learn from such a small number of samples, but I will get more samples as soon as they are labeled. The purpose for now was just to check whether all the code works for my particular configuration and dataset. I think that the problem is related to the data generator. By simply looking at the traceback, can you guess what the problem is? I would appreciate it.
Here the traceback:
Using TensorFlow backend.
2019-06-06 22:27:45.839343: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-06-06 22:27:45.839619: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x24571e0 executing computations on platform Host. Devices:
2019-06-06 22:27:45.839672: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0):
2019-06-06 22:27:46.014073: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-06-06 22:27:46.014607: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x2456dc0 executing computations on platform CUDA. Devices:
2019-06-06 22:27:46.014662: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2019-06-06 22:27:46.015072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
totalMemory: 14.73GiB freeMemory: 14.60GiB
2019-06-06 22:27:46.015105: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-06 22:27:46.490548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-06 22:27:46.490609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-06-06 22:27:46.490620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-06-06 22:27:46.491015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14115 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
100% 13/13 [00:00<00:00, 1985.87it/s]
Seen labels: {'ref209': 13, 'ref209_1': 13, 'ref209_3': 13, 'ref209_4': 13, 'ref209_5': 13, 'ref209_6': 13, 'ref209_7': 13, 'ref210_1': 13, 'tool209_1': 13, 'tool209_2': 13}
Given labels: ['ref209', 'ref209_1', 'ref209_3', 'ref209_4', 'ref209_5', 'ref209_6', 'ref209_7', 'ref210_1', 'tool209_1', 'tool209_2']
Overlap labels: {'ref209_3', 'ref209_5', 'ref209_6', 'ref210_1', 'ref209_1', 'tool209_2', 'ref209', 'tool209_1', 'ref209_7', 'ref209_4'}
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Loading pretrained weights: ./backend_weights/full_yolo_backend.h5
(13, 13)
Layer (type)                 Output Shape              Param #
input_1 (InputLayer)         (None, 416, 416, 3)       0
Full_YOLO_backend (Model)    (None, 13, 13, 1024)      50547936
Detection_layer (Conv2D)     (None, 13, 13, 75)        76875
YOLO_output (Reshape)        (None, 13, 13, 5, 15)     0
Total params: 50,624,811
Trainable params: 50,604,139
Non-trainable params: 20,672
WARNING:tensorflow:From /content/keras_yolov2_proyecto/keras_yolov2/yolo_loss.py:73: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.cast instead.
Epoch 1/8
Traceback (most recent call last):
File "train.py", line 127, in
main()
File "train.py", line 123, in main
score_threshold=config['valid']['score_threshold'])
File "/content/keras_yolov2_proyecto/keras_yolov2/frontend.py", line 210, in train
max_queue_size=max_queue_size)
File "/usr/local/lib/python3.6/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_generator.py", line 181, in fit_generator
generator_output = next(output_generator)
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 601, in get
six.reraise(*sys.exc_info())
File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 595, in get
inputs = self.queue.get(block=True).get()
File "/usr/lib/python3.6/multiprocessing/pool.py", line 670, in get
raise self._value
File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.6/dist-packages/keras/utils/data_utils.py", line 401, in get_index
return _SHARED_SEQUENCES[uid][i]
File "/content/keras_yolov2_proyecto/keras_yolov2/preprocessing.py", line 250, in __getitem__
img, all_objs = self.aug_image(train_instance, jitter=self._jitter)
File "/content/keras_yolov2_proyecto/keras_yolov2/preprocessing.py", line 359, in aug_image
obj[attr] = int(obj[attr] * scale - offx)
KeyError: 'xmin'
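The last frame shows why the data, not the code, is at fault: the augmentation step indexes every object dict by `'xmin'`. A toy reproduction of that failure mode (the hypothetical `scale_boxes` stands in for the loop in `aug_image`; it is not the repo's code):

```python
def scale_boxes(objs, scale, offx):
    """Sketch of the box-scaling step: every object dict is assumed to
    carry pixel coordinates under 'xmin'/'xmax', so an annotation
    without them (e.g. a polygon label) raises KeyError, exactly as in
    the traceback above."""
    for obj in objs:
        for attr in ("xmin", "xmax"):
            obj[attr] = int(obj[attr] * scale - offx)
    return objs
```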