dailenson / SDT

This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR23).
MIT License
computer-vision contrastive-learning deep-learning generative-models gmm handwriting-generation multimodal pytorch-implementation transformer

🔥 Disentangling Writer and Character Styles for Handwriting Generation

ArXiv | Poster | Video | Project

📢 Introduction

Overview of our SDT

Three samples of online characters with writing orders

📅 News

📺 Handwriting generation results

🔨 Requirements

conda create -n sdt python=3.8 -y
conda activate sdt
# install all dependencies
pip install -r requirements.txt
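
After installation, a quick sanity check can confirm that the active interpreter matches the Python 3.8 environment created above and that key packages are importable. The following is only a minimal sketch: `check_environment` is a hypothetical helper that is not part of this repository, and the default package list (`torch`) is an assumption based on this being a PyTorch implementation.

```python
import importlib.util
import sys


def check_environment(required=(3, 8), packages=("torch",)):
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper for illustration; `packages` is an assumed list,
    adjust it to match requirements.txt.
    """
    problems = []
    # Compare only major.minor, e.g. (3, 8).
    if tuple(sys.version_info[:2]) != tuple(required):
        found = "%d.%d" % tuple(sys.version_info[:2])
        wanted = "%d.%d" % tuple(required)
        problems.append("Python %s expected, found %s" % (wanted, found))
    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append("package not importable: %s" % name)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Run it inside the activated `sdt` environment; no output means the interpreter version and the listed packages look fine.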

📂 Folder Structure

  SDT/
  │
  ├── train.py - main script to start training
  ├── test.py - generate characters via trained model
  ├── evaluate.py - evaluation of generated samples
  │
  ├── configs/*.yml - holds configuration for training
  ├── parse_config.py - class to handle config file
  │
  ├── data_loader/ - anything about data loading goes here
  │   └── loader.py
  │
  ├── model_zoo/ - pre-trained content encoder model
  │
  ├── data/ - default directory for storing experimental datasets
  │
  ├── model/ - networks, models and losses
  │   ├── encoder.py
  │   ├── gmm.py
  │   ├── loss.py
  │   ├── model.py
  │   └── transformer.py
  │
  ├── saved/
  │   ├── models/ - trained models are saved here
  │   ├── tborad/ - tensorboard visualization
  │   └── samples/ - visualization samples in the training process
  │
  ├── trainer/ - trainers
  │   └── trainer.py
  │
  └── utils/ - small utility functions
      ├── util.py
      └── logger.py - set log dir for tensorboard and logging output
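
To confirm that a local checkout matches the layout listed above, one option is a small script that reports which entries are absent. This is a sketch under stated assumptions: `missing_paths` and `EXPECTED_PATHS` are hypothetical names, not part of the repository, and `saved/` is left out on the assumption that it is only created at run time.

```python
from pathlib import Path

# Entries copied from the folder structure above. saved/ is omitted on the
# assumption that training creates it; configs/ is checked as a directory.
EXPECTED_PATHS = [
    "train.py",
    "test.py",
    "evaluate.py",
    "parse_config.py",
    "configs",
    "data_loader/loader.py",
    "model_zoo",
    "data",
    "model/encoder.py",
    "model/gmm.py",
    "model/loss.py",
    "model/model.py",
    "model/transformer.py",
    "trainer/trainer.py",
    "utils/util.py",
    "utils/logger.py",
]


def missing_paths(root="."):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing:", path)
```

Running it from the repository root prints one line per missing entry, which helps spot a `data/` or `model_zoo/` directory that has not been populated yet.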

💿 Datasets

We provide Chinese, Japanese and English datasets on Google Drive | Baidu Netdisk (password: xu9u). Please download these datasets, unzip them and move the extracted files to /data.
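
The unzip-and-move step can also be scripted. The sketch below assumes the downloads are `.zip` archives sitting in one local folder, and that the target is the `data/` directory from the folder structure; `extract_archives` and both directory arguments are hypothetical and should be adapted to wherever the files were actually saved.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, data_dir="data"):
    """Extract every .zip file in `download_dir` into `data_dir`.

    Hypothetical helper: assumes the datasets were downloaded as .zip
    archives. Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Keeps the internal folder layout of each archive.
            zf.extractall(data_dir)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # "downloads" is a placeholder for the folder holding the archives.
    print(extract_archives("downloads", "data"))
```

If an archive contains its own top-level folder, the extracted files end up one level deeper than `data/`, so it is worth checking the result against the paths expected by the configs.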

πŸ” Pre-trained model

🚀 Training & Test

Training

Qualitative Test

Quantitative Evaluation

🏰 Practical Application

We are delighted that P0etry-rain has proposed a pipeline that first converts the results generated by our SDT to TTF format and then provides software for flexibly adjusting the spacing between paragraphs, lines, and characters. Below, we present the TTF files, the software interface, and the printed results. More details can be found in #78.

❀️ Citation

If you find our work inspiring or use our codebase in your research, please cite our work:

@inproceedings{dai2023disentangling,
  title={Disentangling Writer and Character Styles for Handwriting Generation},
  author={Dai, Gang and Zhang, Yifan and Wang, Qingfeng and Du, Qing and Yu, Zhuliang and Liu, Zhuoman and Huang, Shuangping},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5977--5986},
  year={2023}
}

⭐ StarGraph

Star History Chart