amzn / convolutional-handwriting-gan

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation (CVPR20)
https://www.amazon.science/publications/scrabblegan-semi-supervised-varying-length-handwritten-text-generation
MIT License

How to generate training dataset CVLtrH32? #10

Open chenyangMl opened 3 years ago

chenyangMl commented 3 years ago

Thanks for this remarkable work. I want to run semi-supervised training with the following command:

    python train_semi_supervised.py --dataname IAMcharH32W16rmPunct --unlabeled_dataname CVLtrH32 --disjoint

I have downloaded cvl-database-1-1 and noticed that the CVLtrH32 definition in "data/dataset_catalog.py" looks like this:

    "CVLtrH32": _DATA_ROOT+'CVL/h32/train_new_partition'

But I cannot create that directory with the script "data/create_text_data.py", because there is no corresponding value for the mode parameter. The related code is at lines 89~100 of "create_text_data.py":

    if dataset=='CVL':
        root_dir = os.path.join(root_dir, 'cvl-database-1-1')
        if words:
            images_name = 'words'
        else:
            images_name = 'lines'
        if mode == 'tr' or mode == 'val':
            mode_dir = ['trainset']
        elif mode == 'te':
            mode_dir = ['testset']
        elif mode == 'all':
            mode_dir = ['testset', 'trainset']

I changed the dataset entry to "CVLtrH32": _DATA_ROOT+'CVL/h32/tr' and created the dataset with the script. Training now runs, but I am not sure whether this is correct.

So, how do I create the correct dataset and get semi-supervised training working well? I look forward to your answer.

sharonFogel commented 3 years ago

It's correct. We just haven't updated the dataset dictionary after changing the data generation code.
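
For anyone hitting the same mismatch, the workaround confirmed above amounts to pointing the catalog key at the directory the generation script actually writes. The following is only a minimal sketch of that idea, not the repository's code: the value of `_DATA_ROOT`, the dictionary name `datasets`, and the helper `resolve_dataset` are assumptions made for illustration. Only the key "CVLtrH32" and the two path suffixes come from this thread.

```python
import os

# Assumption: placeholder root. The real value is set in data/dataset_catalog.py.
_DATA_ROOT = 'Datasets/'

# Assumption: the catalog is a plain dict; the name `datasets` is illustrative.
datasets = {
    # Entry as reported in the issue: _DATA_ROOT + 'CVL/h32/train_new_partition'
    # Entry after the workaround confirmed in this thread:
    "CVLtrH32": _DATA_ROOT + 'CVL/h32/tr',
}


def resolve_dataset(name, catalog=datasets):
    """Hypothetical helper: return the catalog path and whether it exists on disk."""
    path = catalog[name]
    return path, os.path.isdir(path)


path, exists = resolve_dataset("CVLtrH32")
status = "found" if exists else "missing, generate it with data/create_text_data.py first"
print(f"{path}: {status}")
```

Running a check like this before launching train_semi_supervised.py makes a stale catalog entry visible immediately, instead of surfacing later as a data-loading error.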