MhLiao / TextBoxes_plusplus

TextBoxes++: A Single-Shot Oriented Scene Text Detector

Recognition Results issue #101

Closed manueldiduch closed 5 years ago

manueldiduch commented 5 years ago

Hi @MhLiao!!

I "reproduced" your text localization results, with only a small difference in the numbers. However, I am not getting good results in the end-to-end protocol :/ I tried both with the generic lexicon and without it (I am using your demo.py as a baseline). On ICDAR15, my end-to-end F-measure is about 37% with the lexicon and 32% without it.

Can you help me with some directions, please?

Could it be a crnn_model problem? Is the available crnn_model up to date? Do the cropped and resized images need to be adjusted, i.e., should I increase their height and/or width?

Thanks for your help!!

An example is attached to show the difference from your result in the paper. The bounding boxes look similar, but the recognition results are different :(

[Attached image: converse]

MhLiao commented 5 years ago

There may be several issues. (1) Have you checked the cropped images? Are they cropped correctly? Expanding the bounding boxes may help a little, but not much; you can give it a try. (2) For the end-to-end experiments in the paper, we use multi-scale testing for better recall. (3) The CRNN model we use in the paper is the TPAMI version, while the released CRNN model is the arXiv version. They were trained with slightly different training strategies, but I think the two models perform similarly.
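For point (1), a minimal sketch of expanding a detected box before cropping is shown below. It is not taken from demo.py: the function names `expand_quad` and `crop_axis_aligned`, the `ratio` parameter, and the default of 0.1 are all assumptions made for illustration, and the crop is a plain axis-aligned one rather than whatever rectification the released pipeline applies.

```python
import numpy as np


def expand_quad(quad, ratio=0.1, image_size=None):
    """Scale a 4-point box about its centroid by (1 + ratio).

    quad: four (x, y) corners. ratio: hypothetical expansion factor.
    image_size: optional (width, height) used to clip the result.
    """
    pts = np.asarray(quad, dtype=np.float64).reshape(4, 2)
    center = pts.mean(axis=0)
    expanded = center + (pts - center) * (1.0 + ratio)
    if image_size is not None:
        width, height = image_size
        expanded[:, 0] = np.clip(expanded[:, 0], 0, width - 1)
        expanded[:, 1] = np.clip(expanded[:, 1], 0, height - 1)
    return expanded


def crop_axis_aligned(image, quad):
    """Crop the axis-aligned bounding rectangle of a 4-point box."""
    pts = np.asarray(quad, dtype=np.float64).reshape(4, 2)
    x0, y0 = np.floor(pts.min(axis=0)).astype(int)
    x1, y1 = np.ceil(pts.max(axis=0)).astype(int)
    # Clamp the top-left corner so negative values do not wrap around.
    x0, y0 = max(x0, 0), max(y0, 0)
    return image[y0:y1 + 1, x0:x1 + 1]


if __name__ == "__main__":
    image = np.zeros((40, 60, 3), dtype=np.uint8)  # height 40, width 60
    box = [[10, 10], [30, 10], [30, 20], [10, 20]]
    bigger = expand_quad(box, ratio=0.1, image_size=(60, 40))
    crop = crop_axis_aligned(image, bigger)
    print(bigger.tolist(), crop.shape)
```

Saving a few such crops to disk and inspecting them by eye is a quick way to confirm that the text is fully inside the region passed to the recognizer.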

manueldiduch commented 5 years ago

Thanks for your help :) I am running some experiments to try to reproduce your end-to-end results.