AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Not detecting number plate though average loss is less than 0.06 #2048

Closed NamburiSrinath closed 5 years ago

NamburiSrinath commented 5 years ago

Hi, I am trying to detect the number plate region of a car. The detailed problem is described in the link below (it is too lengthy to paste here). Please look into it; any help is very much appreciated.

By the way, my images are around 100x32 (different images, different sizes) and contain only the number plates; I didn't do any resizing. I think YOLO will resize them. Is that true?

The link with the detailed description of the problem is attached here:

https://medium.com/@namburisrinath/hey-hi-80660945331b

I have followed https://medium.com/@manivannan_data/how-to-train-yolov2-to-detect-custom-objects-9010df784f36 to train the model on my data.

If any further clarification is needed, please do ask.

Thanks in advance,
Srinath

AlexeyAB commented 5 years ago

@NamburiSrinath Hi,

I think YOLO will resize that. Is it true??

Yes.


I think the main mistake is that you are using training images which are not similar to the detection images. You break this rule: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

General rule - your training dataset should include such a set of relative sizes of objects that you want to detect:

train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width

Since train_network_width == detection_network_width in your case, you break this rule:

train_obj_width / train_image_width ~= detection_obj_width / detection_image_width


From your images, 0.9 != 0.1:

Training images: train_obj_width / train_image_width ~= 0.9 (the plate-only crop attached above)

Detection images: detection_obj_width / detection_image_width ~= 0.1 (the full car scene attached above)
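
As a minimal sketch of this check (all widths below are hypothetical, chosen only to reproduce the ~0.9 vs ~0.1 mismatch above):

```python
# Sanity-check the relative-size rule. Since the network width is the same for
# training and detection, it cancels out and only obj_width / image_width matters.
# All numbers are hypothetical examples, not values taken from this issue.

def relative_width(obj_width_px, image_width_px):
    return obj_width_px / image_width_px

train_ratio = relative_width(90, 100)     # plate-only training crop -> ~0.9
detect_ratio = relative_width(130, 1280)  # full traffic frame at test time -> ~0.1

print(f"train ~= {train_ratio:.2f}, detect ~= {detect_ratio:.2f}")
if abs(train_ratio - detect_ratio) > 0.2:  # arbitrary tolerance for illustration
    print("Relative object sizes differ too much - add full-scene images to the training set.")
```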

NamburiSrinath commented 5 years ago

Thank you so much Alexey for this lucid explanation.

So you are saying to train it with images which look like this (image attached)

rather than the number plate image directly (which I posted earlier).

Also, do I need to include any negative samples while training (like, "hey, this is not a number plate!")? Or does YOLO take care of that?

Thanks in advance

AlexeyAB commented 5 years ago

So you are saying to train it with images which look like this (image attached)

Yes.

And with images like this (example image attached).


Also, do I need to include any negative samples while training (like, "hey, this is not a number plate!")?

Yes. This is optional, but it is very desirable: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

It is desirable that your training dataset includes images with non-labeled objects that you do not want to detect - negative samples without bounding boxes (empty .txt files). Use as many images of negative samples as there are images with objects.

NamburiSrinath commented 5 years ago

Thanks again,

Small clarification Alexey.

What is the difference between the first and second images that you posted? Is it the clarity, position, or rotation that matters?

Also, if I give negative samples, the text file should be empty - that's fine. But classes=1 stays the same and there is no need to add any new label names to the obj.names file, right?

Negative samples means I can take any random images and place them in the training folder, right? Are there any constraints on their number or anything else?

Thanks in advance,
Srinath

AlexeyAB commented 5 years ago

What is the difference between the first and second images that you posted? Is it the clarity, position, or rotation that matters?

As I see it:

for the 1st image: train_obj_width / train_image_width ~= 0.3; for the 2nd image: train_obj_width / train_image_width ~= 0.15

The more your training images resemble your detection images, the better.


Also, if I give negative samples, the text file should be empty - that's fine. But classes=1 stays the same and there is no need to add any new label names to the obj.names file, right?

Negative samples means I can take any random images and place them in the training folder, right? Are there any constraints on their number or anything else?

Yes, just add any images.

Then, for example, run https://github.com/AlexeyAB/Yolo_mark and go through all the images by holding the SPACE button, so that empty txt-files are created automatically and these images are added to train.txt.
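
If you would rather not click through Yolo_mark, a small script can do the same thing; this is only a sketch, and the paths (data/obj/negatives/, data/train.txt) are assumptions based on the usual Yolo_mark layout, not something specified in this issue:

```python
import glob
import os

# Sketch: create empty .txt label files for negative images and list them in train.txt.
# The paths below are assumptions (typical Yolo_mark layout) - adjust them to your setup.
negatives = glob.glob("data/obj/negatives/*.jpg")

with open("data/train.txt", "a") as train_list:
    for img_path in negatives:
        label_path = os.path.splitext(img_path)[0] + ".txt"
        open(label_path, "a").close()      # empty label file marks a negative sample
        train_list.write(img_path + "\n")  # darknet picks up the image from train.txt
```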

NamburiSrinath commented 5 years ago

Thanks a lot, Alexey, for helping me out. I have a few queries though:

I have taken 1500 images (900 positive and 600 negative) as the training set and 400 images (300 positive and 100 negative) as the validation set [1900 images in total].

(attached: 000190, a negative image from the training set)

  1. Is it correct/okay if the negative images also have number plates in them?

  2. Here are some screenshots I captured during training (attached above). My dataset has only 1900 images, so what do the numbers shown there indicate (83072 images, 66816 images)? If it is data augmentation, please be specific about what effects are applied to an image and how many images it produces.

  3. How can I improve the detection? The average loss came down below 0.06xx as mentioned in your repo, and it is detecting a few number plates correctly, but I want to improve the detections such as the ones attached below:

(attached: example detection crops img_3008, img_3030, img_3224, img_87)

I want to process a video (a traffic video in particular) and get the number plates of the vehicles. Right now it is detecting plates, but with low accuracy and wrong bounding boxes. Please give suggestions on what to do so that accuracy increases and it becomes easier for OCR to read the number plate.

Thanking you,
Srinath

AlexeyAB commented 5 years ago

@NamburiSrinath

  1. Is it correct/okay if the negative images also have number plates in them?

That is very bad. Negative images must not contain any objects, because every object must be labeled and negative images have no labels: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

Check that every object in your dataset is labeled - no object in your dataset should be left without a label. Most training issues are caused by wrong labels in the dataset (labels obtained with some conversion script, marked with a third-party tool, ...). Always check your dataset by using: https://github.com/AlexeyAB/Yolo_mark


  2. Here are some screenshots I captured during training. My dataset has only 1900 images, so what do the numbers shown there indicate (83072 images, 66816 images)? If it is data augmentation, please be specific about what effects are applied to an image and how many images it produces.

Yes, it is data augmentation. An effectively unlimited number of modified images will be generated; color, size and aspect ratio are changed randomly (params in the cfg file: hue, exposure, saturation, jitter, random).
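
For reference, these parameters usually sit in the cfg like this (the values shown are the common yolov3-style defaults; your own cfg may differ):

```
[net]
...
saturation = 1.5
exposure = 1.5
hue = .1

[yolo]
...
jitter = .3
random = 1
```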


  3. How can I improve the detection? The average loss came down below 0.06xx as mentioned in your repo, and it is detecting a few number plates correctly, but I want to improve the detections (example crops img_3008, img_3030, img_3224, img_87 attached in the previous comment).

izesaon commented 5 years ago

resize

Is resizing the same as cropping the image to get the desired size, or is it simply resizing the image to the desired size while maintaining the aspect ratio? I've been trying to figure out why my model is not detecting properly - the loss is decreasing, but so is the objectness score...

dis-is-pj commented 5 years ago

How do I crop the prediction? I'm also working on number plate detection.

NamburiSrinath commented 5 years ago

resize

Is resizing the same as cropping the image to get the desired size, or is it simply resizing the image to the desired size while maintaining the aspect ratio? I've been trying to figure out why my model is not detecting properly - the loss is decreasing, but so is the objectness score...

No!!

Resizing and cropping an image are two different operations.

Assume you have an image of a car:

  1. Suppose, in the given example of detecting number plates, you crop the number plates (from the car image) and feed them to the architecture. Then your model will not recognize number plates when you give it a test image (which contains a car and a number plate).

  2. If you resize the image (which contains the car and the number plate), label it properly and feed it to the network, then your model will detect the number plate when a test image is passed (which contains a car and a number plate).

So, these are two different operations. In general, when cropping you are eliminating data from the image, but when resizing you are not.

Note: DNN architectures do have constraints on the size of the image, so you can't always expect them to detect a small football in a satellite image (it depends, but in my case YOLO was not very good at detecting small objects).
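
A small OpenCV sketch of the difference described above (the file name and the box coordinates are placeholders, not values from this issue):

```python
import cv2

# Placeholder input: a full photo of a car; "car.jpg" is not a file from this issue.
img = cv2.imread("car.jpg")             # e.g. 1280x720, with the plate as a small region

# Resizing keeps the whole scene (car + plate); only the scale changes.
resized = cv2.resize(img, (416, 416))   # roughly what darknet does to fit the network input

# Cropping discards everything outside the chosen region (a made-up plate box here).
x, y, w, h = 600, 420, 130, 40
cropped = img[y:y + h, x:x + w]         # plate only - the surrounding context is lost
```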

NamburiSrinath commented 5 years ago

How do I crop the prediction? I'm also working on number plate detection.

Hello dis-is-pj

If you are working with AlexeyAB's fork, you can save the coordinates output by the network to a file. You can then write a small script that reads the image and crops it using the coordinates you get from the architecture. There are direct functions in OpenCV to do that; it's fairly simple (check the slicing format: image[a:b, c:d]).
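
A minimal sketch of that cropping step with OpenCV, assuming the predicted box is already available as pixel coordinates (left_x, top_y, width, height); the file names and numbers are placeholders:

```python
import cv2

# Placeholder inputs: one frame and one predicted box in pixel coordinates.
img = cv2.imread("frame.jpg")
left_x, top_y, width, height = 512, 300, 140, 45   # hypothetical values saved from the detector

# Clamp the box to the image borders so the slice never goes out of range.
img_h, img_w = img.shape[:2]
x1, y1 = max(0, left_x), max(0, top_y)
x2, y2 = min(img_w, left_x + width), min(img_h, top_y + height)

plate = img[y1:y2, x1:x2]               # the image[a:b, c:d] crop mentioned above
cv2.imwrite("plate_crop.jpg", plate)    # this crop can then be passed to the OCR step
```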