clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.83k stars 476 forks

Questions for Label of dataset customizing #216

Open JayMay1994 opened 1 year ago

JayMay1994 commented 1 year ago

Hi, I read your paper and found it really inspiring. I have a couple of questions:

  1. About making my own dataset: do I still need to label the bounding box of each text area?
  2. Does the model abandon predicting text bounding boxes, as an object detection model would?

sudhitpanchal commented 1 year ago

  1. Yes, you need to label whatever in your image you want to give as input.

  2. The model doesn't abandon prediction. It might not understand the words in the text, but it will predict according to the data it was trained on.
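
For reference, here is a minimal sketch of what one label entry might look like. It assumes the `metadata.jsonl` layout described in the repository README, where each line holds a `file_name` and a `ground_truth` JSON string wrapping a `gt_parse` object. The helper name `make_metadata_line`, the image file name, and the field names are hypothetical and only for illustration. Under that assumption the label carries the target field text rather than box coordinates.

```python
import json


def make_metadata_line(file_name, fields):
    """Build one metadata.jsonl line (hypothetical helper).

    Assumes the README layout: `ground_truth` is itself a JSON string
    that wraps the target fields under the `gt_parse` key.
    """
    ground_truth = json.dumps({"gt_parse": fields}, ensure_ascii=False)
    record = {"file_name": file_name, "ground_truth": ground_truth}
    return json.dumps(record, ensure_ascii=False)


# Hypothetical receipt sample: only field text is stored, no coordinates.
line = make_metadata_line(
    "receipt_0001.png",
    {"store": "Example Mart", "total": "12.50"},
)
print(line)
```

Each line produced this way would be appended to the `metadata.jsonl` file that sits next to the images in a dataset split; check the README for the exact keys your task expects.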