Closed: manueldiduch closed this issue 5 years ago
There may be several issues. (1) Did you check the cropped images? Are they cropped correctly? Expanding the bounding boxes may help a little, but not too much; you can give it a try. (2) In the paper, for the end-to-end experiments, we use multi-scale testing for better recall. (3) The CRNN model we use in the paper is the TPAMI version, while the released CRNN model is the arXiv version. They are trained with slightly different training strategies, but I think the two models have similar performance.
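For (1), I mean something along these lines (just a rough sketch, not the exact code in this repo; `expand_ratio` is an illustrative parameter you would need to tune):

```python
# Rough sketch of expanding an axis-aligned box before cropping the word image
# for recognition. Not the exact code of this repo; expand_ratio is illustrative.
import numpy as np

def expand_and_crop(image, box, expand_ratio=0.1):
    """image: HxWxC numpy array; box: (x_min, y_min, x_max, y_max) in pixels."""
    h, w = image.shape[:2]
    x_min, y_min, x_max, y_max = box
    dx = (x_max - x_min) * expand_ratio / 2.0
    dy = (y_max - y_min) * expand_ratio / 2.0
    # Expand symmetrically and clip to the image borders.
    x_min = max(0, int(np.floor(x_min - dx)))
    y_min = max(0, int(np.floor(y_min - dy)))
    x_max = min(w, int(np.ceil(x_max + dx)))
    y_max = min(h, int(np.ceil(y_max + dy)))
    return image[y_min:y_max, x_min:x_max]
```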
Thanks for your help :) I am running some experiments to reproduce your end-to-end results.
Hi @MhLiao!!
I "reproduced" your textLocalization results with a little difference on the results. But, I am not getting good results in the end-to-end protocol :/ I tried with the generic Lexicon and without it (I am using your demo.py as baseline). On the ICDAR15, in terms of F-measure, my results are about 37% with Lexicon and 32% without it.
Could you point me in the right direction, please?
Could it be a problem with the crnn_model? Is the released crnn_model up to date? Do the cropped and resized images need to be adjusted, e.g., by increasing their height and/or width?
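For reference, my current preprocessing is roughly the following sketch (my own code, not from this repo): each detected box is cropped, converted to grayscale, and resized to the 100x32 input used by the released CRNN demo.

```python
# My current preprocessing sketch (not code from this repo): crop, grayscale,
# and resize each box to the 100x32 input used by the released CRNN demo.
from PIL import Image

def prepare_crnn_input(image, box, size=(100, 32)):
    """image: PIL.Image; box: (x_min, y_min, x_max, y_max); returns a grayscale crop."""
    crop = image.crop(box).convert('L')
    return crop.resize(size, Image.BILINEAR)
```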
Thanks for your help!!
An example is attached to show the difference from your result in the paper. The bounding boxes look similar, but the recognition results are different :(