ankanbhunia / Handwriting-Transformers

Handwriting-Transformers (ICCV21)
MIT License

For handwriting generation in other languages -- how to properly fine-tune the model? #15

Open rainchamber opened 1 year ago

rainchamber commented 1 year ago

@ankanbhunia

While you mention: "You can train the model in any custom dataset other than IAM and CVL. The process involves creating a dataset_name.pickle file and placing it inside files folder. The structure of dataset_name.pickle is a simple python dictionary."

I went through the code, and the approach you suggest seems to retrain the model on another dataset from scratch rather than fine-tune your pretrained model. Since the paper does not discuss this much, I'd like to ask your opinion: if I want to apply the model to generate handwriting in another language, e.g. Japanese, is there a way to quickly fine-tune your model on new Japanese handwriting data, or do I need to retrain it from scratch?
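For reference, a minimal sketch of building such a `files/dataset_name.pickle` is below. The split names, per-writer grouping, and the `'img'`/`'label'` keys are assumptions made for illustration; the exact schema should be checked against the repo's README and data-loading code before use.

```python
import pickle
from PIL import Image

# Hypothetical schema: each split maps writer ids to a list of word-image
# crops with their text labels. Verify against the repo before relying on it.
def build_pickle(samples, out_path="files/dataset_name.pickle"):
    # samples: list of (split, writer_id, image_path, transcription) tuples
    data = {"train": {}, "test": {}}
    for split, writer_id, img_path, text in samples:
        img = Image.open(img_path).convert("L")  # greyscale word crop
        data[split].setdefault(writer_id, []).append({"img": img, "label": text})
    with open(out_path, "wb") as f:
        pickle.dump(data, f)

build_pickle([
    ("train", "writer_001", "crops/w001_hello.png", "hello"),
    ("test",  "writer_002", "crops/w002_world.png", "world"),
])
```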

ankanbhunia commented 1 year ago

I haven't tried fine-tuning for a different language, so I can't say for sure what will happen. However, here's what I think: for a different language, you need to change the OCR network's last layer accordingly. The knowledge of a fully trained OCR network is very specific to its language, so I believe there is no benefit to reusing OCR weights from a different language. In other words, you may need to initialize the OCR weights randomly. You can, however, reuse the generator and discriminator weights from the IAM pretrained checkpoint, which can potentially reduce the overall training time.
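A rough sketch of that selective initialization is shown below; the constructor, the checkpoint path, and the `generator.`/`discriminator.` key prefixes are hypothetical and need to be mapped to the actual names used in this repo.

```python
import torch

# Hypothetical sketch: reuse generator/discriminator weights from the IAM
# checkpoint, but leave the OCR recognizer randomly initialized because its
# character classes differ for a new language such as Japanese.
ckpt = torch.load("files/iam_model.pth", map_location="cpu")

model = build_model(num_chars=len(new_alphabet))  # placeholder constructor

kept = {k: v for k, v in ckpt.items()
        if k.startswith(("generator.", "discriminator."))}  # drop OCR keys
missing, unexpected = model.load_state_dict(kept, strict=False)
print("randomly initialized:", missing)  # should list only the OCR parameters
```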

If you proceed in this manner, though, you may encounter training instabilities. To deal with this, you can pretrain the OCR separately before plugging it into the end-to-end training with the GAN.
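One way such a separate OCR pretraining stage could look is a standard CTC training loop on real word images; the recognizer constructor, data loader, and output shape below are assumptions, not this repo's actual API.

```python
import torch
import torch.nn.functional as F

# Hypothetical pretraining loop: train the recognizer alone with CTC loss,
# then reuse its weights when starting the end-to-end GAN training.
ocr = build_recognizer(num_chars=len(new_alphabet) + 1)  # +1 for the CTC blank
opt = torch.optim.Adam(ocr.parameters(), lr=1e-4)
ctc = torch.nn.CTCLoss(blank=0, zero_infinity=True)

for imgs, targets, target_lens in real_word_loader:      # placeholder loader
    logits = ocr(imgs)                    # assumed shape: (T, batch, classes)
    log_probs = F.log_softmax(logits, dim=2)
    input_lens = torch.full((imgs.size(0),), logits.size(0), dtype=torch.long)
    loss = ctc(log_probs, targets, input_lens, target_lens)
    opt.zero_grad()
    loss.backward()
    opt.step()

torch.save(ocr.state_dict(), "files/ocr_pretrained.pth")
```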

Having said that, I'd suggest retraining it from scratch. It will be more straightforward :).  

rainchamber commented 1 year ago

@ankanbhunia Thanks for the reply!

Just a quick follow-up question: if I work on a historical English handwriting dataset (perhaps extracted from handwritten script in old Bibles), putting the OCR network issue aside, would you suggest training from scratch or fine-tuning the model?

Thanks in advance for letting me know! I'm trying to extend your cool paper for some projects.

ankanbhunia commented 1 year ago

In that case, I'd suggest trying to fine-tune the model first.
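Since the alphabet is unchanged in this case, a plausible (untested) starting point for fine-tuning is to load the full IAM checkpoint and resume training on the new pickle with a reduced learning rate; the constructor, attribute names, and file paths below are illustrative, not the repo's actual entry point.

```python
import torch

# Illustrative fine-tuning setup: start from the full IAM checkpoint
# (OCR weights included, since the character set is the same English one)
# and continue training on the historical-English pickle at a lower LR.
model = build_model(num_chars=len(english_alphabet))          # placeholder
model.load_state_dict(torch.load("files/iam_model.pth", map_location="cpu"))

opt_g = torch.optim.Adam(model.generator.parameters(), lr=2e-5)      # lower LR
opt_d = torch.optim.Adam(model.discriminator.parameters(), lr=2e-5)
# ...then run the usual training loop on 'files/historical_english.pickle'.
```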

rainchamber commented 1 year ago

Thanks!