Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition
MIT License

Data Generator working manner #237

Open Tailor2019 opened 2 years ago

Tailor2019 commented 2 years ago

@Belval @bact @junxnone @yifeitao @gachiemchiep Hello! Thanks a lot for this great generator! What are the specific characteristics of the synthetic images generated by this tool? And how does it proceed when the input is an image containing text (how does it take the text and generate images)? Thanks a lot for your help!

gachiemchiep commented 2 years ago

@Tailor2019 Sorry, I don't understand your question.

Tailor2019 commented 2 years ago

@gachiemchiep Thanks for your interest in my question. I mean: what operations are applied to a text to generate this synthetic data? What is the difference between the 1000 generated images? Thanks in advance!

gachiemchiep commented 2 years ago

@Tailor2019 Each image is generated using

  1. a randomly selected text from a dictionary file
  2. a randomly selected font from the font directory
  3. a randomly selected color value from a color range
  4. a randomly selected background image or color

So the generated images are random. All you can do is collect as many fonts as possible and a large enough set of background images. Personally, I use the MINC-2500 dataset as background images.
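The four random choices listed above can be sketched roughly as follows. This is only an illustration, not the generator's actual code: `sample_render_params`, the file names, and the gray-level range `(0, 80)` are all made up for the example.

```python
import random

def sample_render_params(words, font_paths, color_range, backgrounds, rng=None):
    """Pick the random ingredients for one synthetic image (hypothetical helper)."""
    rng = rng or random.Random()
    low, high = color_range  # inclusive bounds of the text color range
    return {
        "text": rng.choice(words),              # 1. text from the dictionary
        "font": rng.choice(font_paths),         # 2. font from the font directory
        "color": rng.randint(low, high),        # 3. color value from the range
        "background": rng.choice(backgrounds),  # 4. background image or color
    }

# Placeholder inputs; in practice these would come from your own files.
words = ["hello", "world", "synthetic"]
fonts = ["fonts/a.ttf", "fonts/b.ttf"]
backgrounds = ["bg/wood.jpg", "bg/stone.jpg", "plain-white"]

rng = random.Random(0)  # seeded only so the run is repeatable
samples = [sample_render_params(words, fonts, (0, 80), backgrounds, rng)
           for _ in range(1000)]
```

Each of the 1000 samples is an independent draw, so the images differ only in which text, font, color, and background happened to be picked. That is why more fonts and more backgrounds give more variety.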

Tailor2019 commented 2 years ago

Thanks for your reply! @gachiemchiep When I use the option "-d", which defines an image directory to use when the background is set to image, I use this image: 0137. But after the data is generated, the backgrounds look like those in these images:
ﻂﺳﺍﻭﺃ ﻲﻓ ﻥﺍﻮﺴﺘﻴﺷ ﺔﻌﻃﺎﻘﻣ ﻯﺮﻗ ﻯﺪﺣﺈﺑ ﺔﻴﻧﺍﺪﻴﻣ ﺔﺳﺍﺭﺪﺑ ﻡﺎﻗ ﻱﺬﻟﺍ ،ﻲﻟ ﻦﺸﺗ ﻲﻧﺎﻜﺴﻟﺍ ﺚﺣﺎﺒﻟﺍ ﻝﻮﻘﻳ_4
ﻂﺳﺍﻭﺃ ﻲﻓ ﻥﺍﻮﺴﺘﻴﺷ ﺔﻌﻃﺎﻘﻣ ﻯﺮﻗ ﻯﺪﺣﺈﺑ ﺔﻴﻧﺍﺪﻴﻣ ﺔﺳﺍﺭﺪﺑ ﻡﺎﻗ ﻱﺬﻟﺍ ،ﻲﻟ ﻦﺸﺗ ﻲﻧﺎﻜﺴﻟﺍ ﺚﺣﺎﺒﻟﺍ ﻝﻮﻘﻳ_5
ﻂﺳﺍﻭﺃ ﻲﻓ ﻥﺍﻮﺴﺘﻴﺷ ﺔﻌﻃﺎﻘﻣ ﻯﺮﻗ ﻯﺪﺣﺈﺑ ﺔﻴﻧﺍﺪﻴﻣ ﺔﺳﺍﺭﺪﺑ ﻡﺎﻗ ﻱﺬﻟﺍ ،ﻲﻟ ﻦﺸﺗ ﻲﻧﺎﻜﺴﻟﺍ ﺚﺣﺎﺒﻟﺍ ﻝﻮﻘﻳ_6
ﻂﺳﺍﻭﺃ ﻲﻓ ﻥﺍﻮﺴﺘﻴﺷ ﺔﻌﻃﺎﻘﻣ ﻯﺮﻗ ﻯﺪﺣﺈﺑ ﺔﻴﻧﺍﺪﻴﻣ ﺔﺳﺍﺭﺪﺑ ﻡﺎﻗ ﻱﺬﻟﺍ ،ﻲﻟ ﻦﺸﺗ ﻲﻧﺎﻜﺴﻟﺍ ﺚﺣﺎﺒﻟﺍ ﻝﻮﻘﻳ_7

How does it process these backgrounds? Does it take a horizontal part of the image used for the background? How does it measure its width? Or did it take another background from the images folder of this project (for example, for the last image)? Thanks for helping me understand!

gachiemchiep commented 2 years ago

@Tailor2019 The logic of using background images is defined inside this method. https://github.com/Belval/TextRecognitionDataGenerator/blob/ab83b94fd10ecdace77c77fddb2727d8e4c85289/trdg/background_generator.py#L58

The background image is randomly cropped and resized. You can change it to fit your needs.
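As a rough illustration of "randomly cropped and resized", here is a minimal sketch that only computes the geometry. It is not a copy of the linked method: `plan_background_crop` is a hypothetical helper, and it assumes the background is upscaled only when it is smaller than the output and that the crop offset is drawn uniformly.

```python
import random

def plan_background_crop(img_w, img_h, out_w, out_h, rng=None):
    """Return (resized_size, crop_box) for a random crop of size out_w x out_h.

    Hypothetical helper: the image is upscaled, keeping its aspect ratio,
    only if it is too small to cover the requested output size.
    """
    rng = rng or random.Random()
    scale = max(1.0, out_w / img_w, out_h / img_h)
    new_w = max(out_w, round(img_w * scale))
    new_h = max(out_h, round(img_h * scale))
    # Random top-left corner such that the crop stays inside the image.
    left = rng.randint(0, new_w - out_w)
    top = rng.randint(0, new_h - out_h)
    return (new_w, new_h), (left, top, left + out_w, top + out_h)

# A 640x480 background for a 300x32 text line: no resize, just a random crop.
size, box = plan_background_crop(640, 480, 300, 32, random.Random(1))

# A 100x50 background is too narrow, so it is scaled up before cropping.
small_size, small_box = plan_background_crop(100, 50, 300, 32, random.Random(1))
```

With Pillow, the two results could be passed to `Image.resize(size)` and `Image.crop(box)`. Because a text line is much wider than it is tall, the crop ends up looking like a horizontal strip of the background image, taken from a different position each time.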

Tailor2019 commented 2 years ago

Thanks a lot, this is very helpful! @gachiemchiep For the -hw option, "model-29.data-00000-of-00001" is used. In this case, does the generator use the model to predict the text in the image, or something else? Thanks for explaining this.