qjadud1994 / CRNN-Keras

CRNN (CNN+RNN) for OCR using Keras / License Plate Recognition
MIT License

Increase input size of network #23

Open ghost opened 5 years ago

ghost commented 5 years ago

Hi, is it possible to increase the input size of the network from 128x64 to 256x128?

qjadud1994 commented 5 years ago

Of course it is possible. Just add a downsampling layer (max pooling) to get a 32-d output, or just use the 64-d output.
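The two options above can be checked with a little arithmetic. The following is only a sketch: it assumes the network currently shrinks the image width by a factor of 4 (a value not stated in this thread, chosen because it reproduces the 32 and 64 mentioned above), and `time_steps` is a hypothetical helper, not a function from the repository.

```python
def time_steps(img_w: int, downsample_factor: int) -> int:
    """Number of feature-map columns (RNN time steps) left after the CNN."""
    return img_w // downsample_factor


# Assumption: the existing pooling layers reduce the width by 4 in total.
ASSUMED_FACTOR = 4

# Current setup: 128-pixel-wide input.
original = time_steps(128, ASSUMED_FACTOR)

# 256-pixel-wide input, network unchanged: twice as many time steps.
wider_same_net = time_steps(256, ASSUMED_FACTOR)

# 256-pixel-wide input plus one extra 2x pooling along the width:
# the total factor doubles, so the time-step count matches the original.
wider_extra_pool = time_steps(256, ASSUMED_FACTOR * 2)

print(original, wider_same_net, wider_extra_pool)
```

Under that assumption, adding one more 2x pooling step along the width would mean doubling `downsample_factor` as well, so that the lengths passed to the CTC loss still match the network output.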

ghost commented 5 years ago

Hi @qjadud1994, if I add one downsampling layer to get a 32-d output, do I need to change downsample_factor to another value? Also, please explain this parameter: I don't know what it is or what role it plays in training.

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

Why don't you use indices 0 and 1 in y_pred[:, 2:, :]?

    input_length = np.ones((self.batch_size, 1)) * (self.img_w // self.downsample_factor - 2)  # (bs, 1)

Why do you use (self.img_w // self.downsample_factor - 2)? Why minus 2? In the definition of the CTC loss function, you have this comment:

    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:

Why is the 2 critical, and why do the first outputs tend to be garbage? If I use an input size of 256x128, does this number (2) change? And if I want the CRNN input to have an aspect ratio of 5, what should I do?
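To make the relationship between the two snippets concrete, here is a small shape check. It is a sketch under stated assumptions, not the repository's code: the batch size, class count, and the factor of 8 are made-up values, `ctc_input_length` is a hypothetical helper, and a zero array stands in for the real network output. It only shows that slicing off the first 2 time steps in `y_pred[:, 2:, :]` is why 2 is subtracted in `input_length`.

```python
import numpy as np


def ctc_input_length(img_w: int, downsample_factor: int, trimmed_steps: int = 2) -> int:
    """Time steps that remain after the first `trimmed_steps` outputs are dropped."""
    return img_w // downsample_factor - trimmed_steps


# Hypothetical values for illustration only.
batch_size, num_classes = 4, 10
img_w, downsample_factor = 256, 8

# Stand-in for the network output: (batch, time steps, classes).
steps = img_w // downsample_factor
y_pred = np.zeros((batch_size, steps, num_classes))

# Same slice as in ctc_lambda_func: drop the first 2 time steps.
trimmed = y_pred[:, 2:, :]

# Same formula as in the data generator: shape (bs, 1).
input_length = np.ones((batch_size, 1)) * ctc_input_length(img_w, downsample_factor)

# The length reported to the CTC loss must equal the trimmed sequence length.
assert trimmed.shape[1] == int(input_length[0, 0])
print(trimmed.shape, input_length.shape)
```

In the quoted code the 2 is a hard-coded constant, so it does not depend on the image width. What changes with a 256-pixel-wide input is the `img_w // downsample_factor` term, and the resulting length still has to be at least as long as the longest label for the CTC loss to be computable.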