amzn / convolutional-handwriting-gan

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation (CVPR20)
https://www.amazon.science/publications/scrabblegan-semi-supervised-varying-length-handwritten-text-generation
MIT License

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

This is a PyTorch implementation of the paper "ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation".

Dependency

Training

Supervised Training

python train.py --name_prefix demo --dataname RIMEScharH32W16 --capitalize --display_port 8192

Semi-Supervised Training

python train_semi_supervised.py --dataname IAMcharH32W16rmPunct --unlabeled_dataname CVLtrH32 --disjoint

LMDB file generation for training data

Before generating an LMDB file, download the desired dataset into Datasets:

The structure of the directories should be:

To generate an LMDB file for training from one of the datasets (CVL, IAM, RIMES, or GW), run:

cd data
python create_text_data.py

The generated LMDB file will be saved in the relevant dataset folder, and the dictionary will be saved in the Lexicon folder.
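For readers unfamiliar with LMDB text datasets, the following is a minimal sketch of how image/label pairs are commonly laid out as key/value records. It assumes the CRNN-style key convention (`image-%09d`, `label-%09d`, `num-samples`); this README does not state which keys `create_text_data.py` writes, so treat the key names, `build_records`, and `write_lmdb` as illustrative assumptions rather than the repository's actual code.

```python
from typing import Dict, Iterable, Tuple


def build_records(samples: Iterable[Tuple[bytes, str]]) -> Dict[bytes, bytes]:
    """Map (encoded_image_bytes, label) pairs to key/value records.

    Assumed layout (CRNN-style, not confirmed by this README):
    image-000000001 -> image bytes, label-000000001 -> UTF-8 label,
    num-samples -> total number of samples as a decimal string.
    """
    records: Dict[bytes, bytes] = {}
    count = 0
    for count, (image_bytes, label) in enumerate(samples, start=1):
        records[b"image-%09d" % count] = image_bytes
        records[b"label-%09d" % count] = label.encode("utf-8")
    records[b"num-samples"] = str(count).encode("utf-8")
    return records


def write_lmdb(path: str, records: Dict[bytes, bytes], map_size: int = 1 << 30) -> None:
    """Write the records into an LMDB environment at `path`."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(path, map_size=map_size)
    with env.begin(write=True) as txn:  # commits when the block exits cleanly
        for key, value in records.items():
            txn.put(key, value)
    env.close()
```

Keeping record construction separate from the write step makes the layout easy to inspect without creating a database on disk.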

Generating an LMDB file with GAN data

python generate_wordsLMDB.py --dataname IAMcharH32rmPunct --results_dir ./lmdb_files/IAM_concat --n_synth 100,200 --name model_name 

Main Folders

The structure of the code is based on the structure of the CycleGAN code.

  1. data/ - Folder containing functions relating to the data, including generation, data loading, alphabets and a catalog which translates dataset names into folder locations. The dataset_catalog should be updated according to the path to the LMDB file you are using.
  2. models/ - Folder containing the models (with the forward, backward and optimization functions) and the network architectures. The generator and discriminator architectures are based on BigGAN. The recognizer architecture is based on CRNN.
  3. options/ - Files containing the arguments for the training and data generation process.
  4. plots/ - Python notebook files with visualizations of the data.
  5. util/ - General functions used across the packages, such as loss definitions.
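The catalog described in item 1 can be pictured as a plain mapping from dataset names (the values passed to `--dataname`) to LMDB folder locations. The sketch below is hypothetical: the dictionary name `DATASETS`, the helper `resolve_dataset`, and every path in it are placeholders chosen for illustration, not the contents of the repository's dataset_catalog.

```python
import os

# Hypothetical catalog: dataset name -> LMDB folder, relative to a data root.
# The paths are placeholders; replace them with the locations of your own LMDB files.
DATASETS = {
    "RIMEScharH32W16": os.path.join("Datasets", "RIMES", "lmdb_h32_w16"),
    "IAMcharH32W16rmPunct": os.path.join("Datasets", "IAM", "lmdb_h32_w16_rm_punct"),
    "CVLtrH32": os.path.join("Datasets", "CVL", "lmdb_train_h32"),
}


def resolve_dataset(name: str, root: str = ".") -> str:
    """Return the LMDB folder for a dataset name, or raise a descriptive error."""
    try:
        relative_path = DATASETS[name]
    except KeyError:
        known = ", ".join(sorted(DATASETS))
        raise KeyError(f"Unknown dataset '{name}'. Known datasets: {known}") from None
    return os.path.join(root, relative_path)
```

Failing early with the list of known names makes a mistyped `--dataname` value easier to diagnose than a missing-file error raised later during data loading.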

Citation

If you use this code for your research, please cite our paper.

@inproceedings{fogel2020scrabblegan,
    title={ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation},
    author={Sharon Fogel and Hadar Averbuch-Elor and Sarel Cohen and Shai Mazor and Roee Litman},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

License

ScrabbleGAN is released under the MIT license. See the LICENSE and THIRD-PARTY-NOTICES.txt files for more information.

Contributing

Your contributions are welcome!
See CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.