ankush-me / SynthText

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
http://www.robots.ox.ac.uk/~vgg/data/scenetext/
Apache License 2.0

postprocessing code from Jaderberg et al #9

Closed biggerlambda closed 7 years ago

biggerlambda commented 7 years ago

I came here after reading the CVPR 2016 paper. Is the post-processing code/models from Jaderberg et al. (used in the paper) available at all? Thanks!

biggerlambda commented 7 years ago

ping!

cjnolet commented 7 years ago

I basically had to write and train the filtering classifier and regression framework myself. I ended up using a CNN for the filtering classifier and got much better performance than I did with the HOG features and random forest. Either way, I did not find this code anywhere.

The non-maximal suppression algorithm is supplied in Piotr Dollar's image processing toolbox (but it's not too hard to write on your own). The algorithms used for the initial bounding box proposals are available in Piotr Dollar's toolbox as well. I was not able to get the performance out of my own trained ACF detector that Jaderberg achieved, but I did find that using Selective Search (I can post the reference if you are interested) in combination with EdgeBoxes and ACF gave me 99.8% recall on the dataset I generated with Ankush's SynthText scripts.
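
For reference, a minimal sketch of greedy non-maximal suppression over axis-aligned boxes (this is not Piotr Dollar's implementation; the `[x1, y1, x2, y2]` box format and the IoU threshold of 0.5 are assumptions):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop remaining boxes
    whose IoU with it exceeds iou_thresh, and repeat.

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,) array of detection scores
    Returns the indices of the kept boxes.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of box i with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # retain only boxes that overlap less than the threshold
        order = order[1:][iou <= iou_thresh]
    return keep
```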

biggerlambda commented 7 years ago

Thanks, this is very helpful. Can you please post the reference as well?

What was the overall prediction performance of the model in terms of timing?

cjnolet commented 7 years ago

I'm actually struggling with that at work right now. The Selective Search algorithm was very good in terms of recall, but to get that number I had to turn up the maximum number of bounding boxes that can be proposed. As a result, between the three algorithms, I have to run over 200k bounding boxes through the filtering classifier, which can take one to several minutes. I am still in the process of figuring out thresholds so that I can filter the initial proposals by their scores beforehand. Filtering them at 0.5 (and at 0 in the case of Selective Search, since its scores range from -1 to 1) seemed to lower the recall quite a bit.
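
To illustrate the kind of pre-filtering described above, here is a hedged sketch of pruning proposals per source before the expensive filtering-classifier pass. The function name `prefilter_proposals`, the input format (a dict mapping source name to `(boxes, scores)` arrays), and the threshold values are all hypothetical placeholders, not code from the paper:

```python
import numpy as np

# Per-source score thresholds applied before the filtering classifier.
# The values here are placeholders; as noted above, setting them too
# aggressively lowers recall.
THRESHOLDS = {
    "acf": 0.5,
    "edgeboxes": 0.5,
    "selective_search": 0.0,   # Selective Search scores range over [-1, 1]
}

def prefilter_proposals(proposals):
    """Drop low-scoring proposals per source before the CNN filtering pass.

    proposals: dict mapping source name -> (boxes, scores), where boxes is
    an (N, 4) array and scores an (N,) array.
    Returns the concatenated surviving boxes and scores.
    """
    kept_boxes, kept_scores = [], []
    for source, (boxes, scores) in proposals.items():
        mask = scores >= THRESHOLDS[source]
        kept_boxes.append(boxes[mask])
        kept_scores.append(scores[mask])
    return np.concatenate(kept_boxes), np.concatenate(kept_scores)
```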

I will post the reference tomorrow while at work; I have all my BibTeX entries on my work computer.


ankush-me commented 7 years ago

As far as I know, the post-processing code from Jaderberg et al. is not available yet.

cjnolet commented 7 years ago

https://arxiv.org/abs/1604.02619


biggerlambda commented 7 years ago

Thanks all