Open riteshKumarUMass opened 2 years ago
Could someone respond to this?
Hi, I have the following three questions and would be really grateful if anyone could provide some insights:
- While pretraining the model on the text lines extracted from the PDFs and on the synthetic data, do you maintain the aspect ratio of the image when resizing it to 384x384? Using Hugging Face's TrOCR preprocessor, I noticed that it does not maintain the aspect ratio, so I would like to understand whether this affects the model's performance.
- Did a "textline" contain multiple words in a single image, or did you split the image further at the word level before feeding it to the model?
- Did you try training the model at the word level instead of the line level, and did you notice any difference?
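
To make the first question concrete, here is a minimal sketch of what an aspect-ratio-preserving resize could look like: the longer side is scaled to 384 and the shorter side is padded. This is only an illustration of the alternative being asked about, not what the TrOCR authors or the Hugging Face preprocessor do. The function name `letterbox_geometry` and its parameters are hypothetical.

```python
def letterbox_geometry(width, height, target=384):
    """Return (new_w, new_h, pad_left, pad_top) for fitting a
    width x height image inside a target x target square without
    distorting it. Hypothetical helper, for illustration only."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Scale so that the longer side exactly matches the target size.
    scale = target / max(width, height)

    # Clamp to [1, target] to guard against rounding at the edges.
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))

    # Center the resized image; the remaining area would be padding.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# A wide text-line crop (assumed size 1200x60) keeps its 20:1 shape.
print(letterbox_geometry(1200, 60))
```

With a plain 384x384 resize, the same 1200x60 crop would be stretched about 20 times more vertically than horizontally, which is the distortion the first question is about. The returned values could be passed to an image library's resize and paste calls.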