shabie / docformer

Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer-based architecture for the task of Visual Document Understanding (VDU)
MIT License

Predictions are wrong. #36

Open BakingBrains opened 2 years ago

BakingBrains commented 2 years ago

I am training DocFormer and have completed 6 epochs, but the predictions are not good. Could you please tell me the minimum number of epochs required to get better predictions on document classification?

Also, in the Hugging Face demo, why is id2label in this sequence?

id2label = ['scientific_report',
            'resume',
            'memo',
            'file_folder',
            'specification',
            'news_article',
            'letter',
            'form',
            'budget',
            'handwritten',
            'email',
            'invoice',
            'presentation',
            'scientific_publication',
            'questionnaire',
            'advertisement']
uakarsh commented 2 years ago

Hi there,

For Q1: I am really not sure how many epochs would be required to get a good result, since I have not experimented with that. But as soon as I find something related to it, I will get back to you.

For Q2: id2label is in that specific order because, while I was training the model on the Kaggle platform, this was the exact order used for label encoding, i.e. the 0th class was 'scientific_report', and so on.
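For reference, here is a minimal sketch of how that ordering maps a predicted class index back to a label at inference time (the logits tensor and its shape here are illustrative placeholders, not the actual demo code):

import torch

# Same ordering as used during training: index i in the model's
# output corresponds to id2label[i].
id2label = ['scientific_report', 'resume', 'memo', 'file_folder',
            'specification', 'news_article', 'letter', 'form',
            'budget', 'handwritten', 'email', 'invoice',
            'presentation', 'scientific_publication', 'questionnaire',
            'advertisement']

# `logits` stands in for a (batch_size, num_classes) tensor produced by
# the classification head; argmax over the class dimension gives the
# predicted class index, which id2label maps back to a name.
logits = torch.randn(1, len(id2label))   # placeholder model output
pred_idx = logits.argmax(dim=-1).item()
print(pred_idx, id2label[pred_idx])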

Regards, Akarsh