Closed hoangdzung closed 4 years ago
You cannot put 7-character and 8-character texts in the same batch as they are.
This is usually solved by adding padding.
If the max text length is 9, turn a 7-character text into a length-9 text by adding padding before or after it.
In addition, grouping texts of the same length in the same batch is efficient for training.
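A minimal sketch of the grouping idea, assuming the labels are plain Python strings. The helper name group_by_length and the sample plate texts are hypothetical and not taken from the repository:

```python
from collections import defaultdict


def group_by_length(texts):
    """Return a dict mapping text length -> list of texts with that length."""
    buckets = defaultdict(list)
    for text in texts:
        buckets[len(text)].append(text)
    return dict(buckets)


# Hypothetical plate texts of length 7 and 8.
plates = ["ABC1234", "XYZ12345", "DEF5678", "QRS98765"]
buckets = group_by_length(plates)
for length, group in sorted(buckets.items()):
    print(length, group)
```

Each bucket can then be batched on its own, so texts inside one batch need no padding relative to each other.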
@qjadud1994, what do you mean by "adding padding"?
.ljust(self.max_text_len)
.extend((self.max_text_len-len(text))*[36])
And there's also the question of where to add it: left or right?
Padding can be any character, such as a space or *. However, the padding character should not affect the prediction.
It does not matter whether you put the padding to the left or to the right of the text.
ex) max_len = 5: [B, Y, E, *, *], [H, I, *, *, *]
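The left/right choice can be sketched with Python's built-in str.ljust and str.rjust. The helper pad_text and its side parameter are hypothetical; the pad character "*" and the max length of 5 follow the example in this comment:

```python
def pad_text(text, max_text_len, pad_char="*", side="right"):
    """Pad text to max_text_len with pad_char on the chosen side."""
    if len(pad_char) != 1:
        raise ValueError("pad_char must be a single character")
    if len(text) > max_text_len:
        raise ValueError("text is longer than max_text_len")
    if side == "right":
        return text.ljust(max_text_len, pad_char)
    if side == "left":
        return text.rjust(max_text_len, pad_char)
    raise ValueError("side must be 'left' or 'right'")


print(pad_text("BYE", 5))               # BYE**
print(pad_text("HI", 5, side="left"))   # ***HI
```

Whichever side is chosen, it should be applied the same way to every label.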
Do we need to add the character "*" (used for padding) to the list of characters? For example, if my plate file is named XYZ1234*.jpg, it says the character is not in the list.
And if I add it, the accuracy is 0.
@qjadud1994 @soldierofhell @hmunshi I wonder whether adding "*" after a normal label will affect the performance of CRNN?
The best method is to use the 'blank' symbol for padding, like the text_to_labels function in this code:
https://github.com/tuanphan09/captcha-recognition/blob/master/data_gen.py
@xinyuegtxy adding an extra character to the labels won't affect CRNN performance. I've already done that, but it will make your model a little bit bigger (the number of character classes increases by 1). So the best way is to use 'blank', as I said above, and it also makes sense.
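A rough sketch of blank padding, under stated assumptions: the alphabet is the 36 symbols A-Z and 0-9, so index 36 is the extra 'blank' class (this matches the [36] used in the .extend snippet earlier in the thread, but it is an assumption about the character list). The function text_to_padded_labels is a hypothetical name, not the text_to_labels function from the linked code:

```python
# Assumed alphabet: 26 letters followed by 10 digits (36 symbols).
CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
# Assumed blank index: one past the last real character, i.e. 36.
BLANK = len(CHARS)


def text_to_padded_labels(text, max_text_len):
    """Encode text as character indices, padded on the right with BLANK.

    Returns the padded label list and the number of real characters.
    """
    if len(text) > max_text_len:
        raise ValueError("text is longer than max_text_len")
    # str.index raises ValueError for a character outside CHARS.
    labels = [CHARS.index(ch) for ch in text]
    label_length = len(labels)
    labels.extend([BLANK] * (max_text_len - label_length))
    return labels, label_length


labels, label_length = text_to_padded_labels("AB1", 5)
print(labels, label_length)  # [0, 1, 27, 36, 36] 3
```

Returning the true label length alongside the padded list lets a CTC loss that accepts per-sample label lengths skip the padded tail, and no new character class is needed.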
Can your model deal with variable-length plates? In Image_Generator.py, you write Y_data[i]= text_to_labels(text), which means "text" must have a length of 9. What about 7 or 8 characters? Or does this model only work with 9-character plates? Thank you.