microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI

TrOCR base model: 30% CER on IAM word dataset vs 4% on IAM line dataset, is this normal? #1653

Open slender9168 opened 1 week ago

slender9168 commented 1 week ago

Describe the bug

Model I am using: microsoft/trocr-base-handwritten

Dataset: IAM word dataset and IAM line dataset

The problem arises when using: the official sample inference code

A clear and concise description of what the bug is: when running microsoft/trocr-base-handwritten against the IAM word dataset (single words), I get a CER of about 30%; when running it against the IAM line dataset, the CER is about 4%.

  1. Is this expected?
  2. Can I fine-tune the model on single-word images to bring its single-word CER down to around 4%, or is it inherently bad on single words? (A rough sketch of what I have in mind follows this list.)
  3. Is the fact that the model was trained on full lines rather than single words the reason for the 30% CER?
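For question 2, this is roughly the fine-tuning setup I would try, sketched with the Hugging Face Seq2SeqTrainer. It is only a sketch: `word_dataset` is a hypothetical dataset of IAM word crops whose items already contain `pixel_values` and `labels` produced by the processor, and the hyperparameters are illustrative, not tuned.

```python
from transformers import (
    TrOCRProcessor,
    VisionEncoderDecoderModel,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    default_data_collator,
)

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

# The decoder needs these token ids set before training.
model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
model.config.pad_token_id = processor.tokenizer.pad_token_id

args = Seq2SeqTrainingArguments(
    output_dir="trocr-base-iam-words",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    fp16=True,  # only if a CUDA GPU is available
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=word_dataset,  # hypothetical: IAM word crops + transcriptions
    data_collator=default_data_collator,
)
trainer.train()
```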

To Reproduce

Steps to reproduce the behavior:

  1. Use the sample code with microsoft/trocr-base-handwritten against the IAM word dataset; the CER will be around 30%. (A minimal reproduction sketch follows.)
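A minimal reproduction sketch, assuming `images_and_texts` is a list of (PIL image, ground-truth transcription) pairs loaded from the IAM word dataset (the dataset loading itself is omitted), and using jiwer to compute the CER:

```python
import torch
from jiwer import cer  # pip install jiwer
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")
model.eval()

predictions, references = [], []
for image, ground_truth in images_and_texts:
    # TrOCR expects RGB input; IAM scans are grayscale.
    pixel_values = processor(image.convert("RGB"), return_tensors="pt").pixel_values
    with torch.no_grad():
        generated_ids = model.generate(pixel_values)
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    predictions.append(text)
    references.append(ground_truth)

# ~0.30 on word crops vs ~0.04 on full lines in my runs
print(f"CER: {cer(references, predictions):.3f}")
```

Running the same loop over the IAM line dataset instead of the word crops is what gives the ~4% figure quoted above.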