generate IAMSyntheticParagraph entries on the fly

the-full-stack / fsdl-text-recognizer-2022

Source of the FSDL 2022 labs, which are at https://github.com/full-stack-deep-learning/fsdl-text-recognizer-2022-labs

https://fullstackdeeplearning.com/course

MIT License

generate IAMSyntheticParagraph entries on the fly #50

Closed by charlesfrye 2 years ago

charlesfrye commented 2 years ago

Instead of randomly partitioning IAMLines into new paragraphs and saving the result to disk, we now generate a new paragraph on the fly inside the DataLoader from lines that are saved to disk.

Specifically, we add a PyTorch Dataset that combines lines together into paragraphs at indexing time, using the first provided index as a seed.
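The idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `SyntheticParagraphs`, its parameters, and the use of text strings in place of line images are all hypothetical, and it reads "first provided index" as meaning that the first index the dataset receives seeds the random generator once.

```python
import random
from typing import Optional, Sequence

try:
    from torch.utils.data import Dataset
except ImportError:  # keeps the sketch runnable without PyTorch installed
    Dataset = object


class SyntheticParagraphs(Dataset):
    """Hypothetical sketch: joins stored lines into paragraphs at indexing time."""

    def __init__(self, lines: Sequence[str], length: int,
                 min_lines: int = 1, max_lines: int = 4) -> None:
        self.lines = list(lines)    # stand-in for lines loaded from disk
        self.length = length        # nominal dataset size (assumption)
        self.min_lines = min_lines
        self.max_lines = max_lines
        self._rng: Optional[random.Random] = None

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, index: int) -> str:
        # Seed once, using the first index this dataset instance receives.
        if self._rng is None:
            self._rng = random.Random(index)
        num_lines = self._rng.randint(self.min_lines, self.max_lines)
        # Sample lines (with replacement) and join them into one paragraph.
        chosen = self._rng.choices(self.lines, k=num_lines)
        return "\n".join(chosen)
```

Under this reading, nothing synthetic is written to disk, and two fresh instances that receive the same first index produce the same sequence of paragraphs.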

Supersedes #40