herobd / handwriting_line_generation

Code for BMVC2020 paper "Text and Style Conditioned GAN for Generation of Offline Handwriting Lines"

Format of your dataset and utilities #27

Open rezwanh001 opened 2 years ago

rezwanh001 commented 2 years ago

Hello, this is a great project, but it is hard for me to work out the exact format of your dataset. Could you please give a brief overview of the dataset format and of the steps/utilities needed for offline handwritten line generation in other languages? Thanks.

herobd commented 2 years ago

The project used two datasets, IAM and RIMES. The objects in datasets/ read them in their individual formats and then prepare the data in the form the trainer and model expect. If you want to use this on your own dataset or a new language, I'd recommend looking at what I said here: https://github.com/herobd/handwriting_line_generation/issues/23

The data passed to the trainer from the __getitem__ function of the dataset objects is a dictionary with these elements:

You'll see a couple more things returned from the actual dataset files, but they aren't used by the trainer.
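If it helps with exploring the format, one way to see what a dataset object's __getitem__ actually returns is to summarize the dictionary of a single sample. The following is only a rough sketch, not code from this repository: `describe_sample` and `ToyDataset` are hypothetical names, and the keys in `ToyDataset` are invented placeholders, not the keys the trainer expects. In practice you would pass in a sample from one of the real dataset objects instead.

```python
def describe_sample(sample):
    """Summarize a dict-like sample: type name plus shape or length per key.

    Hypothetical helper for exploration; not part of the repository.
    """
    summary = {}
    for key, value in sample.items():
        desc = type(value).__name__
        shape = getattr(value, "shape", None)
        if shape is not None:
            # Tensors and arrays expose .shape
            desc += f" shape={tuple(shape)}"
        elif hasattr(value, "__len__"):
            # Strings, lists, and similar containers expose a length
            desc += f" len={len(value)}"
        summary[key] = desc
    return summary


class ToyDataset:
    """Hypothetical stand-in for a real dataset object.

    The keys below are invented for this demo and are NOT the
    elements the trainer uses.
    """

    def __len__(self):
        return 1

    def __getitem__(self, index):
        return {"example_pixels": [[0, 0], [0, 0]], "example_label": "hello"}


if __name__ == "__main__":
    for key, desc in describe_sample(ToyDataset()[0]).items():
        print(f"{key}: {desc}")
```

Running the same helper on a sample from your own dataset object and comparing the output against a sample from the IAM or RIMES objects should show which entries differ in type or shape.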