Open mck0225 opened 6 years ago
You should base your cfg-file on yolo-voc.2.0.cfg with neural network resolution close to image resolution.
My images vary in size, from 448x400 to 1920x1080. I am using yolo-voc.2.0.cfg with the network resolution set to 832x832. I would like to use it to detect the positions of the characters M, K, 3, 5, 3, 2 in the sample image.
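To make the resolution question concrete, here is a minimal sketch of how the network input size relates to the output grid and to the size of a character after resizing. It assumes the network downsamples by a factor of 32 (five stride-2 max-pool layers) and that the image is stretched to the network size without preserving the aspect ratio. The function names and the 30x60 pixel character are hypothetical, chosen only for illustration.

```python
def output_grid(net_w, net_h, downsample=32):
    """Return the (columns, rows) of the final feature map.

    Assumes a total downsampling factor of 32, so the network
    width and height must be multiples of 32.
    """
    if net_w % downsample or net_h % downsample:
        raise ValueError("network size must be a multiple of %d" % downsample)
    return net_w // downsample, net_h // downsample


def size_after_resize(obj_w, obj_h, img_w, img_h, net_w, net_h):
    """Size in pixels of an object once the image is stretched to the network input."""
    return obj_w * net_w / img_w, obj_h * net_h / img_h


if __name__ == "__main__":
    print(output_grid(832, 832))
    print(output_grid(1024, 1024))
    # Hypothetical 30x60 px character in a 1920x1080 image, network at 832x832.
    print(size_after_resize(30, 60, 1920, 1080, 832, 832))
```

If a character ends up much smaller than one 32x32 pixel grid cell after resizing, a larger network resolution may help.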
Here is the cfg file I used.

[net]
batch = 64
subdivisions = 16
width = 832
height = 832
channels = 3
momentum = 0.9
decay = 0.0005
angle = 0
saturation = 1.5
exposure = 1.5
hue = .1

learning_rate = 0.0001
max_batches = 590000
policy = steps
steps = 60000, 90000, 100000
scales = .1, .1, .1

[convolutional]
batch_normalize = 1
filters = 32
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]
size = 2
stride = 2

[convolutional]
batch_normalize = 1
filters = 64
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]
size = 2
stride = 2

[convolutional]
batch_normalize = 1
filters = 128
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 64
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 128
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]
size = 2
stride = 2

[convolutional]
batch_normalize = 1
filters = 256
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 128
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 256
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]
size = 2
stride = 2

[convolutional]
batch_normalize = 1
filters = 512
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 256
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 512
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 256
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 512
size = 3
stride = 1
pad = 1
activation = leaky

[maxpool]
size = 2
stride = 2

[convolutional]
batch_normalize = 1
filters = 1024
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 512
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 1024
size = 3
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 512
size = 1
stride = 1
pad = 1
activation = leaky

[convolutional]
batch_normalize = 1
filters = 1024
size = 3
stride = 1
pad = 1
activation = leaky

#######

[convolutional]
batch_normalize = 1
size = 3
stride = 1
pad = 1
filters = 1024
activation = leaky

[convolutional]
batch_normalize = 1
size = 3
stride = 1
pad = 1
filters = 1024
activation = leaky

[route]
layers = -9

[reorg]
stride = 2

[route]
layers = -1, -3

[convolutional]
batch_normalize = 1
size = 3
stride = 1
pad = 1
filters = 1024
activation = leaky

[convolutional]
size = 1
stride = 1
pad = 1
filters = 320
activation = linear

[region]
anchors = 0.631, 1.77, 1.01, 3.12, 1.15, 3.5, 1.2, 3.97, 1.42, 4.22
bias_match = 1
classes = 59
coords = 4
num = 5
softmax = 1
jitter = .2
rescore = 1

object_scale = 5
noobject_scale = 1
class_scale = 1
coord_scale = 1

absolute = 1
thresh = .6
random = 1
steps = 100, 10000, 60000, 90000, 100000
scales = 10, .1, .1, .1, .1
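As a quick consistency check on the cfg above: in a YOLOv2-style cfg, the last convolutional layer before [region] needs num * (classes + coords + 1) filters. The sketch below (the function name is hypothetical) confirms that filters = 320 matches classes = 59, coords = 4 and num = 5.

```python
def region_filters(classes, num, coords=4):
    """Filters required in the convolutional layer that feeds a [region] layer.

    Each of the `num` anchors predicts `coords` box values,
    one objectness score, and `classes` class scores.
    """
    return num * (classes + coords + 1)


if __name__ == "__main__":
    # Values taken from the cfg in this thread.
    print(region_filters(classes=59, num=5))
```

If the number of classes changes, this value has to be recomputed and written to the cfg by hand.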
Good morning, @AlexeyAB. Thank you for the reply.
With the currently trained weights, only about 0.3% of the characters are detected, and the predicted positions are not accurate.
The command line is: detector train TrainSet/MyData.data MyData.cfg darknet19_448.conv.23
I am now retraining as suggested, with the network resolution changed to 1024x1024 and steps = 100, 10000, 60000, 90000, 100000 scales = 10, .1, .1, .1, .1
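To see what these steps and scales do, here is a minimal sketch of a step-based learning rate schedule: each time the iteration count reaches a step, the current rate is multiplied by the matching scale. It assumes the base learning_rate = 0.0001 from the cfg and ignores any warm-up (burn_in); the function name is hypothetical and this is not darknet's own code.

```python
def learning_rate_at(iteration, base_lr, steps, scales):
    """Learning rate at a given iteration under a 'steps' policy.

    Every step that has been reached multiplies the rate by its scale.
    """
    lr = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        lr *= scale
    return lr


if __name__ == "__main__":
    steps = [100, 10000, 60000, 90000, 100000]
    scales = [10, 0.1, 0.1, 0.1, 0.1]
    for it in (0, 100, 10000, 60000, 90000, 100000):
        print(it, learning_rate_at(it, 0.0001, steps, scales))
```

Under these assumptions the rate is raised tenfold after the first 100 iterations, returns to the base value at iteration 10000, and is then divided by 10 at each later step.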
If the characters are still not detected well after this training run, can I conclude that the problem lies in the regions labeled in the training images?
Good morning, @mck0225. You can check your training dataset using this tool: https://github.com/AlexeyAB/Yolo_mark Just quickly go through all the images (by pressing the SPACE button) and make sure they are labeled correctly.
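Alongside a visual check, the label files can be sanity-checked automatically. The sketch below assumes the usual darknet label format, one object per line as `<class> <x_center> <y_center> <width> <height>` with coordinates normalized to the range 0 to 1. The function name and the sample lines are hypothetical.

```python
def check_label_lines(lines, num_classes):
    """Return a list of (line_number, problem) tuples for malformed label lines."""
    problems = []
    for n, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) != 5:
            problems.append((n, "expected 5 fields"))
            continue
        try:
            cls = int(parts[0])
            x, y, w, h = (float(p) for p in parts[1:])
        except ValueError:
            problems.append((n, "non-numeric field"))
            continue
        if not 0 <= cls < num_classes:
            problems.append((n, "class id out of range"))
        centers_ok = 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0
        sizes_ok = 0.0 < w <= 1.0 and 0.0 < h <= 1.0
        if not (centers_ok and sizes_ok):
            problems.append((n, "coordinates not normalized"))
    return problems


if __name__ == "__main__":
    sample = ["0 0.5 0.5 0.1 0.2", "59 0.5 0.5 0.1 0.2"]
    # With 59 classes, valid class ids are 0 to 58.
    print(check_label_lines(sample, num_classes=59))
```

To check a real dataset, read each label .txt file and pass its lines to the function.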
Hi @AlexeyAB. • "Did you try to use Yolo just for detecting separate letters, not whole words?" Does this mean that separate letters are easier to detect? I am working on character detection. I tried to detect all the characters at once, but unfortunately only a few were detected. Thanks.
@Ahntw80 Yes, you can: https://github.com/AlexeyAB/darknet/issues/1112
Good morning. I want to use Yolo to detect the characters on a license plate and read the plate number. There are 59 character classes.
After 20000 training iterations, nothing was detected. I tried changing the anchors and disabling flip, but the results did not improve.
I would like to ask for advice from people who have detected letters with Yolo.
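Since changing the anchors came up, it can help to see what the anchor values mean in pixels. The sketch below assumes that anchors in a YOLOv2 [region] layer are width and height pairs measured in grid cells of the final feature map, and that each cell covers 32x32 pixels of the network input. The anchor values are the ones from the cfg earlier in this thread; the function name is hypothetical.

```python
ANCHORS = [0.631, 1.77, 1.01, 3.12, 1.15, 3.5, 1.2, 3.97, 1.42, 4.22]


def anchors_in_pixels(anchors, cell=32):
    """Convert a flat anchor list (w0, h0, w1, h1, ...) from grid cells to pixels."""
    pairs = zip(anchors[0::2], anchors[1::2])
    return [(w * cell, h * cell) for w, h in pairs]


if __name__ == "__main__":
    for w, h in anchors_in_pixels(ANCHORS):
        print("%.1f x %.1f px" % (w, h))
```

These sizes are in network input pixels, so they should be compared with the size of the labeled characters after the image has been resized to the network resolution.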