bgshih / crnn

Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
MIT License
2.06k stars 552 forks source link

The image of long width has a bad result, the short one does not #41

Open Jayhello opened 7 years ago

Jayhello commented 7 years ago

For example the good result with short width image image

the bad result with long width image image

Duum commented 7 years ago

because the best path decode method is not work well in long term , please change another ctc decode method

Jayhello commented 7 years ago

@Duum Thank very much for your reply. But how to do this in detail ? And can you give some advise about the question https://github.com/bgshih/crnn/issues/39

Duum commented 7 years ago

you can try Prefix Search Decoding or other decoding method.

misssprite commented 7 years ago

@Duum Why does the length of label have a negative effect on the precision of best path decoding? Is there any literature talking about this?

hellbago commented 7 years ago

I think the problem is given by the rescaling of the image to 100x32 size. For long sequence images, by applying this rescaling, single character appears very crushed, and this fact can impact the classification. I don't think best path decoding ca influence the results in such a way.

Heisenberg0391 commented 5 years ago

@hellbago crnn is able to handle images of arbitrary width. So, for testing, i think you only have to resize the image along the height dimension while preserving aspect ratio, otherwise you will get a distorted image like you said