AIWintermuteAI / aXeleRate

Keras-based framework for AI on the Edge
MIT License

Training person detector with pascal_20_detection dataset error #48

Closed arthurkafer closed 3 years ago

arthurkafer commented 3 years ago

Describe the bug
I'm trying to train a MobileNet detector with approx. 10k training images taken from the PASCAL-VOC dataset, after filtering out all images that do not have the 'person' label. Unfortunately, when I try to train it, errors show up about the integrity of the dataset and about an annotation file that could not be opened. It happens with many annotation files; could it be related to the number of files or to the files themselves?
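For context, the filtering step I mention was roughly like this (a minimal sketch with illustrative VOC paths, not the exact script I ran):

import os
import shutil
import xml.etree.ElementTree as ET

VOC_ANN_DIR = "VOCdevkit/VOC2012/Annotations"  # illustrative source paths
VOC_IMG_DIR = "VOCdevkit/VOC2012/JPEGImages"
OUT_IMG_DIR = "geral/imgs"
OUT_ANN_DIR = "geral/anns"

os.makedirs(OUT_IMG_DIR, exist_ok=True)
os.makedirs(OUT_ANN_DIR, exist_ok=True)

for ann_name in os.listdir(VOC_ANN_DIR):
    ann_path = os.path.join(VOC_ANN_DIR, ann_name)
    root = ET.parse(ann_path).getroot()
    # keep the pair only if at least one object is labelled 'person'
    if any(obj.findtext("name") == "person" for obj in root.iter("object")):
        img_name = root.findtext("filename")
        shutil.copy(os.path.join(VOC_IMG_DIR, img_name), OUT_IMG_DIR)
        shutil.copy(ann_path, OUT_ANN_DIR)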

Screenshots
This is my code, nothing special:

%cd /content
# !ls images_v0/imgs_validation
import json
from axelerate import setup_training, setup_evaluation, setup_inference
import tensorflow.keras.backend as K
import traceback
import time

detector_base = {
    "model":{
        "type":                 "Detector",
        "architecture":         "MobileNet1_0", # MobileNet7_5
        "input_size":           224,
        "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "labels":               ["person"],
        "coord_scale" :         1.0,
        "class_scale" :         1.0,
        "object_scale" :        5.0,
        "no_object_scale" :     5.0 # 1.0e
    },
    "weights" : {
        "full":                 "",
        "backend":            "mobilenet_1_0_224_tf_no_top.h5" # 
    },
    "train" : {
        "actual_epoch":         5,
        "train_image_folder":   "geral/imgs",
        "train_annot_folder":   "geral/anns",
        "train_times":          3,
        "valid_image_folder":   "geral/imgs_validation",
        "valid_annot_folder":   "geral/anns_validation",
        "valid_times":          2,
        "valid_metric":         "mAP",
        "batch_size":           8,
        "learning_rate":        1e-4,
        "saved_folder":         "TESTE_ZERO_MEU",
        "first_trainable_layer": "", #conv_pw_13_bn
        "augumentation":        True,
        "is_only_detect" :      False
    },
    "converter" : {
        "type":                 ["k210"]
    }
}

try:
    print(json.dumps(detector_base, indent=4, sort_keys=False))
    K.clear_session()
    model_path = setup_training(config_dict=detector_base)
    K.clear_session()
    setup_evaluation(detector_base, model_path, threshold=0.5)
    print('finalizado treino final')
except Exception as e:
    traceback.print_exc()
    time.sleep(2)

This is the debug output:

This image has an annotation file, but cannot be open. Check the integrity of your dataset. geral/imgs/2008_003320.jpg
  6/553 [..............................] - ETA: 15:10 - loss: 4.4850Traceback (most recent call last):
  File "<ipython-input-4-58aa7d85c5e4>", line 49, in <module>
    model_path = setup_training(config_dict=detector_base)
  File "/content/aXeleRate/axelerate/train.py", line 165, in setup_training
    return(train_from_config(config, dirname))
  File "/content/aXeleRate/axelerate/train.py", line 142, in train_from_config
    config['train']['valid_metric'])
  File "/content/aXeleRate/axelerate/networks/yolo/frontend.py", line 148, in train
    metrics="mAP")
  File "/content/aXeleRate/axelerate/networks/common_utils/fit.py", line 129, in train
    use_multiprocessing = True)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1861, in fit_generator
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 855, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 2943, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 560, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError:  error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 567, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/content/aXeleRate/axelerate/networks/yolo/backend/batch_gen.py", line 102, in __getitem__
    img, boxes, labels = self._img_aug.imread(fname, boxes, labels)
  File "/content/aXeleRate/axelerate/networks/common_utils/augment.py", line 39, in imread
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/script_ops.py", line 249, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 620, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 891, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/data_adapter.py", line 807, in wrapped_generator
    for data in generator_fn():

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 788, in get
    six.reraise(*sys.exc_info())

  File "/usr/local/lib/python3.7/dist-packages/six.py", line 703, in reraise
    raise value

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 779, in get
    inputs = self.queue.get(block=True, timeout=5).get()

  File "/usr/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value

  File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 567, in get_index
    return _SHARED_SEQUENCES[uid][i]

  File "/content/aXeleRate/axelerate/networks/yolo/backend/batch_gen.py", line 102, in __getitem__
    img, boxes, labels = self._img_aug.imread(fname, boxes, labels)

  File "/content/aXeleRate/axelerate/networks/common_utils/augment.py", line 39, in imread
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

     [[{{node PyFunc}}]]
     [[IteratorGetNext]] [Op:__inference_train_function_7154]

Function call stack:
train_function

Environment (please complete the following information):

Additional context
Should I validate anything else about the dataset images?

arthurkafer commented 3 years ago

I forgot to mention: all of the images and the annotation files can be opened fine, even the ones that show up in the errors.

AIWintermuteAI commented 3 years ago

Hi there! What's the folder structure of your project, and what is the content of the geral/anns/2008_003320.xml annotation?

arthurkafer commented 3 years ago

There's nothing special about the folder structure: everything lives under the base folder 'geral', in the imgs, anns, imgs_validation and anns_validation subfolders.

image

The content of the file is nothing special either:

<annotation>
    <folder>VOC2012</folder>
    <filename>2008_003320.jpg</filename>
    <source>
        <database>The VOC2008 Database</database>
        <annotation>PASCAL VOC2008</annotation>
        <image>flickr</image>
    </source>
    <size>
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>motorbike</name>
        <pose>Left</pose>
        <truncated>0</truncated>
        <occluded>1</occluded>
        <bndbox>
            <xmin>80</xmin>
            <ymin>147</ymin>
            <xmax>393</xmax>
            <ymax>347</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
    <object>
        <name>person</name>
        <pose>Left</pose>
        <truncated>0</truncated>
        <occluded>1</occluded>
        <bndbox>
            <xmin>177</xmin>
            <ymin>60</ymin>
            <xmax>296</xmax>
            <ymax>296</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
    <object>
        <name>person</name>
        <pose>Frontal</pose>
        <truncated>1</truncated>
        <occluded>0</occluded>
        <bndbox>
            <xmin>448</xmin>
            <ymin>117</ymin>
            <xmax>483</xmax>
            <ymax>168</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
    <object>
        <name>pottedplant</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <occluded>0</occluded>
        <bndbox>
            <xmin>426</xmin>
            <ymin>162</ymin>
            <xmax>500</xmax>
            <ymax>276</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
    <object>
        <name>pottedplant</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <occluded>1</occluded>
        <bndbox>
            <xmin>79</xmin>
            <ymin>155</ymin>
            <xmax>143</xmax>
            <ymax>274</ymax>
        </bndbox>
        <difficult>0</difficult>
    </object>
</annotation>
arthurkafer commented 3 years ago

Reading the axelerate/networks/common_utils/augment.py file, I see that the problem is actually with the image file, not with the annotation file. That is even stranger, because in my script I open all the files just to check that there is an annotation file for every image. I tested it, and the problem seems to happen in the image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) call, which failed for about 90% of the images in my dataset.
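The test I ran was roughly along these lines (a minimal sketch with illustrative paths), listing every image that OpenCV fails to decode:

import os
import cv2

img_dir = "geral/imgs"
unreadable = []
for name in sorted(os.listdir(img_dir)):
    image = cv2.imread(os.path.join(img_dir, name))
    if image is None:
        # cv2.imread returns None instead of raising, so cvtColor is what fails later
        unreadable.append(name)
        continue
    cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

print(len(unreadable), "unreadable images out of", len(os.listdir(img_dir)))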

Looking at the images, I saw that they weren't actually present in the Google Colab environment. I used to upload them manually from my local computer, but I'll have to host them somewhere so I can download all of them correctly. I'll do that and close the issue if I can make it work that way.

AIWintermuteAI commented 3 years ago

Well, actually, if I'm not mistaken, cv2.cvtColor raises this exception when the image array is empty, meaning no image was opened. imread() on a non-existing image doesn't raise an exception. https://docs.opencv.org/master/db/deb/tutorial_display_image.html You can see there that it returns None if the image cannot be read, for whatever reason. This is why I put both imread and cvtColor in the try-except block. In hindsight, I should have just checked whether the image is None...
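A guard of roughly this shape is what I mean (a sketch, not the actual aXeleRate code; the helper name is just for illustration):

import cv2

def imread_checked(fname):
    image = cv2.imread(fname)
    if image is None:
        # imread silently returns None for missing or corrupt files, so fail loudly here
        raise FileNotFoundError("Cannot read image: " + fname)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)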

Nevertheless, it still seems like an image path problem. If you are not able to debug it yourself, can you share the Colab notebook and dataset with me privately? E.g. DM on LinkedIn or Twitter.

arthurkafer commented 3 years ago

Yes, you're right, only cv2.cvtColor raises the exception; I was doing the validation wrong. But it's OK, I solved the image path and download problem, and it's training fine right now.

image

Thanks for your time and work!

arthurkafer commented 3 years ago

Hey, bringing this issue back again.

I've changed the way I import my dataset: I now upload it to Google Drive and download it inside Google Colab. The error that is happening is the same one as before:

image

And then I went to check whether those images are really in my dataset or not. But then I realized something: the image name geral/imgs/nvcamtest_21284_s00_00000.jpg is the old filename of the image; the new one is frontalimage37.jpg. The annotation file is frontalimage37.xml too, and an annotation file for nvcamtest_21284 does not exist in the dataset folder.

I renamed all of the images using os.rename(old_filename, new_filename). I can send you the Colab link if you could help me debug it.

Could it be something about renaming those files?

Thanks in advance

AIWintermuteAI commented 3 years ago

Hi, Arthur! So, if you look inside one of the annotation files, you'll see something similar to this:

<annotation verified="yes">
    <folder>Mark</folder>
    <filename>IMG_20191130_115225_BURST20.jpg</filename>
    <path>/home/ubuntu/Documents/Mark/IMG_20191130_115225_BURST20.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>4000</width>
        <height>3000</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>mark</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1105</xmin>
            <ymin>574</ymin>
            <xmax>3343</xmax>
            <ymax>2351</ymax>
        </bndbox>
    </object>
</annotation>

This is the standard PASCAL-VOC object annotation format - you can see there is a filename field here. During training, aXeleRate takes that filename and joins it with the image folder path. Someone made a PR to aXeleRate that reads the path field value if it is present in the annotation; I haven't reviewed it yet.

So, to summarize, you can't just rename the files; the filename fields inside the annotations need to be changed too.
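Something along these lines would keep a renamed pair consistent (a minimal sketch, not part of aXeleRate; the helper name is just for illustration):

import os
import xml.etree.ElementTree as ET

def rename_pair(img_dir, ann_dir, old_stem, new_stem):
    # rename the image file itself
    os.rename(os.path.join(img_dir, old_stem + ".jpg"),
              os.path.join(img_dir, new_stem + ".jpg"))
    # rename the annotation and update the <filename> field that gets joined with the image folder
    old_ann = os.path.join(ann_dir, old_stem + ".xml")
    new_ann = os.path.join(ann_dir, new_stem + ".xml")
    tree = ET.parse(old_ann)
    tree.getroot().find("filename").text = new_stem + ".jpg"
    tree.write(new_ann)
    os.remove(old_ann)

# e.g. rename_pair("geral/imgs", "geral/anns", "nvcamtest_21284_s00_00000", "frontalimage37")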

arthurkafer commented 3 years ago

Hi, thanks for the response.

I'll change my dataset