mila-iqia / fuel

A data pipeline framework for machine learning
MIT License
867 stars 268 forks

Add a large MJSynth dataset of cropped word images #353

Open ablavatski opened 8 years ago

ablavatski commented 8 years ago

Add the large MJSynth dataset of cropped word images for word recognition. The dataset was proposed in the paper "Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition" (Jaderberg, M., Simonyan, K., Vedaldi, A., and Zisserman, A., 2014). It consists of 9 million images covering 90k English words and includes the training, validation, and test splits used in that work.