OlafenwaMoses / ImageAI

A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities
https://www.genxr.co/#products
MIT License
8.49k stars · 2.18k forks

Training mAP = 0.00 Help needed #326

Open acleitao76 opened 4 years ago

acleitao76 commented 4 years ago

As I'm not getting anywhere with this, I will ask you guys for help again. I ran a model training for 50 epochs using the code below. When I try to evaluate, I get 0 on all epochs. What am I doing wrong? My project took a turn and changed concept, so now I'm trying to detect bib numbers (marathon number plates). I have ~600 pictures of runners; on each one I annotated the plate (drew a box over it) using LabelImg as described. I uploaded the images to bibnumber/train/img, the annotations to bibnumber/train/annotation, and likewise to bibnumber/validation/annotation. I installed TensorFlow 1.13 and upgraded ImageAI (running on a GCP notebook with a GPU).
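One thing that may be worth double-checking: if I remember the ImageAI docs correctly, the custom detection trainer expects the subfolders under `train/` and `validation/` to be named `images` and `annotations`. A minimal sketch to verify the layout before training (the `bibnumber` path comes from this post; the helper itself is mine, not part of ImageAI):

```python
import os

# Layout ImageAI's DetectionModelTrainer conventionally expects:
# data_directory/train/images, train/annotations,
# validation/images, validation/annotations
EXPECTED = [
    os.path.join(split, sub)
    for split in ("train", "validation")
    for sub in ("images", "annotations")
]

def check_layout(data_directory):
    """Return the list of expected subfolders missing from data_directory."""
    return [p for p in EXPECTED
            if not os.path.isdir(os.path.join(data_directory, p))]

missing = check_layout("bibnumber")
if missing:
    print("Missing subfolders:", missing)
```

If anything is reported missing (e.g. `img` instead of `images`), the trainer may simply find no usable samples.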

Everything seems normal. This is my code for training:

```python
from imageai.Detection.Custom import DetectionModelTrainer

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory="bibnumber")
trainer.setTrainConfig(object_names_array=["bibnumber"], batch_size=4,
                       num_experiments=50,
                       train_from_pretrained_model="pretrained-model.h5")
trainer.trainModel()
```

I got this output, with a different loss for each of the 50 epochs:

```
Using TensorFlow backend.
/home/jupyter/.local/lib/python3.5/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
[the same FutureWarning repeats for the quint8, qint16, quint16, qint32 and resource dtypes]
Generating anchor boxes for training images and annotation...
not well-formed (invalid token): line 1, column 0
Ignore this bad annotation: img/train/annotations/.DS_Store
Average IOU for 9 anchors: 0.84
Anchor Boxes generated.
Detection configuration saved in img/json/detection_config.json
Training on: ['bibnumber']
Training with Batch Size: 4
Number of Experiments: 50
WARNING:tensorflow:From /home/jupyter/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating: Colocations handled automatically by placer.
WARNING:tensorflow:From /home/jupyter/.local/lib/python3.5/site-packages/imageai/Detection/Custom/yolo.py:24: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.cast instead.
Training with transfer learning from pretrained Model
/usr/local/lib/python3.5/dist-packages/keras/callbacks.py:1065: UserWarning: `epsilon` argument is deprecated and will be removed, use `min_delta` instead.
WARNING:tensorflow:From /home/jupyter/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.cast instead.
Epoch 1/50
```
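The "bad annotation" line in the log above is macOS's hidden `.DS_Store` file sitting in the annotations folder; ImageAI skips it here, but sweeping hidden files out of the dataset keeps the logs clean. A minimal sketch (my own helper, not part of ImageAI):

```python
import os

def remove_hidden_files(root):
    """Delete hidden files (e.g. .DS_Store) under root; return their paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("."):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```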

This is my code for the evaluation part:

```python
from imageai.Detection.Custom import DetectionModelTrainer

trainer = DetectionModelTrainer()
trainer.setModelTypeAsYOLOv3()
trainer.setDataDirectory(data_directory="bibnumber")
trainer.evaluateModel(model_path="bibnumber/models",
                      json_path="img/json/detection_config.json",
                      iou_threshold=0.5, object_threshold=0.3, nms_threshold=0.5)
```

I got exactly the same output for every epoch in the list:

```
Starting Model evaluation....
/usr/local/lib/python3.5/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually.
Model File: img/models/detection_model-ex-40--loss-7.47.h5

Using IoU : 0.5
Using Object Threshold : 0.3
Using Non-Maximum Suppression : 0.5
bibnumber: 0.0000
mAP: 0.0000

Model File: img/models/detection_model-ex-18--loss-11.13.h5
```
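For context on what the evaluation thresholds above mean: a predicted box only counts as a true positive if its intersection-over-union (IoU) with a ground-truth box is at least 0.5, so mAP = 0 means no prediction ever overlapped a labeled box that much (or no predictions were made at all). A minimal IoU computation for illustration (boxes as `(xmin, ymin, xmax, ymax)`; this is a sketch of the metric, not ImageAI's internal code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```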

I thought it could be some problem in the library (it's not), so I decided to download the model with the lowest loss, along with the JSON, and try it:

```python
from imageai.Detection.Custom import CustomObjectDetection

detector = CustomObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath("detection_model-ex-50--loss-5.33.h5")
detector.setJsonPath("/detection_config.json")
detector.loadModel()
detections = detector.detectObjectsFromImage(input_image="Move_for_cancer7861.JPG",
                                             output_image_path="Move_for_cancer7861fix.JPG")
for detection in detections:
    print(detection["name"], " : ", detection["percentage_probability"], " : ", detection["box_points"])
```

All I get back is an empty list. I really don't know what I'm doing wrong; can you please point me in the right direction? Another thing... a few very newbie questions: what is the difference between a detection and a prediction? I saw there is a way to train a model from scratch, i.e. without using a pretrained model. Is that applicable to my scenario? Is it better or worse somehow?

Thank you, guys. Sorry for flooding the post; I tried to fix its formatting, but it looks like some characters in the code are messing with it.

rola93 commented 4 years ago

Hi,

I'll try to help you.

In your very first piece of code, you write: `train_from_pretrained_model="pretrained-model.h5"`.

Where did you get the pretrained-model.h5 file?

With respect to your last question (a more "conceptual" one):

> What is the difference between a detection and a prediction? I saw there is a way to train a model from scratch, i.e. without using a pretrained model. Is that applicable to my scenario? Is it better or worse somehow?

ImageAI makes a distinction between those two tasks: detect and predict. It's not the standard way to distinguish between them, and naming them this way is a bit confusing to me.

Detection refers to the so-called object detection task: saying where an object is located in an image. The image may contain several objects, so we would like to know where each of those objects is and what it is.

Prediction refers to the well-known problem of image classification. You can think of this as saying what the main object in an image is. It also applies when you have different kinds/categories of images and would like to know, given an image, which category it belongs to. For instance: does this image belong to the "cats" category or to "dogs"? Or maybe you want to know whether the image is in the set of useful images or not (according to some criterion of usefulness).

Consider that the first problem is far harder than the second, so my advice is to always try to frame your problem as a classification problem. In addition, because the first problem is harder, it involves training bigger models, so you may need more data to train them from scratch and more computational power.

Check out this small article, which summarizes and compares those tasks very well.

acleitao76 commented 4 years ago

Hi there @rola93, thanks for your help. I will read the article as soon as I go out for lunch. I followed this article

more specifically, I downloaded it from here

This is my type of image; sometimes there will be more people, so there is one annotation per person in the same file. I used LabelImg to annotate them in Pascal VOC format, and they look like this (one file per image):

```xml
<annotation>
    <folder>img</folder>
    <filename>Move_for_cancer13834.JPG</filename>
    <path>/corridas/img/Move_for_cancer13834.JPG</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>5184</width>
        <height>3456</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>bibnumber</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>3212</xmin>
            <ymin>1133</ymin>
            <xmax>3637</xmax>
            <ymax>1678</ymax>
        </bndbox>
    </object>
</annotation>
```

I would be really glad if you could point me in the right direction. What is weirder is that as soon as I started doing this, I trained a model with 50 images for 5 cycles and it worked (only for one race, not for all; after all, I didn't have too many images for training back then).
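Since a single malformed or inconsistent annotation can quietly hurt training, a quick sanity check over the VOC files may help rule that out. A sketch using the stdlib XML parser (my own helper, not part of ImageAI or LabelImg):

```python
import xml.etree.ElementTree as ET

def check_voc(xml_text):
    """Return (name, box) pairs whose bounding box falls outside
    the image, or is inverted, in one Pascal VOC annotation."""
    problems = []
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin = int(box.findtext("xmin")), int(box.findtext("ymin"))
        xmax, ymax = int(box.findtext("xmax")), int(box.findtext("ymax"))
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append((obj.findtext("name"), (xmin, ymin, xmax, ymax)))
    return problems
```

Running it over every `.xml` file under `train/annotations` and `validation/annotations` should either flag the bad files or rule annotations out as the cause.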

Thanks in advance.

acleitao76 commented 4 years ago

I was thinking... after reading more carefully... is the problem the path in my annotation files?

acleitao76 commented 4 years ago

Ok, I just changed the paths... same result. I really don't know what is wrong. @rola93, thanks for the article; it helped me a lot to understand the differences. I think I will have to start from scratch again: new dataset, new annotations, everything.

rola93 commented 4 years ago

> This is my type of image; sometimes there will be more people, so there is one annotation per person in the same file. I used LabelImg to annotate them in Pascal VOC format, and they look like this (one file per image)

I think your approach is correct. But what you should do is add a first stage in which you try just to recognize the white surface with the number, using an object detection approach, and then try to extract the number with OCR/image classification or some other technique.

Just to clarify, my point is: do not try to extract the white surface AND its number in a single stage with object detection.

OlafenwaMoses commented 4 years ago

@acleitao76 Although a custom detection model might work for an OCR-based task, it is better, as @rola93 has suggested, to detect the overall bib numbers first and then apply an OCR-based model to extract the number values, using a 2-stage process.
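The 2-stage process can be sketched roughly like this: run the detector, crop each detected box out of the image, and hand each crop to an OCR step. Below is a minimal stand-in using a plain 2D pixel grid (in practice you would crop with Pillow's `Image.crop` on the real image, and `read_digits` is a hypothetical placeholder for whatever OCR you plug in, e.g. Tesseract):

```python
def crop_box(pixels, box):
    """Crop a 2D pixel grid (list of rows) to box = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in pixels[y1:y2]]

def crop_detections(pixels, detections):
    """Stage 1 output -> crops: detections are ImageAI-style dicts
    with a "box_points" = (x1, y1, x2, y2) entry."""
    return [crop_box(pixels, d["box_points"]) for d in detections]

# Stage 2 (hypothetical OCR step), e.g.:
# numbers = [read_digits(crop) for crop in crop_detections(pixels, detections)]
```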

acleitao76 commented 4 years ago

@OlafenwaMoses @rola93 That's exactly the case. I'm just trying to find the tag; then I cut it out and apply OCR (Google's for now, Tesseract soon). The problem is I'm not getting high accuracy finding and cutting out the tag. If the tag changes at all (every race has different colors), it already fails to find it. This weekend I will get my images and label them again to try to train. I'm still getting mAP 0.

rola93 commented 4 years ago

Oh, got it.

Yes, if they are different each time, you will need to either train one model with all of those variations (or as many as possible) or train one model per race.

If the number of different variants is too big, you could also train a big model with as many of them as possible and then fine-tune a version of it for each race.

I don't know your use case :/