NielsRogge / Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.
MIT License

Vision Encoder Decoder Overfits #295

Open Serge9744 opened 1 year ago

Serge9744 commented 1 year ago

Hi @NielsRogge, I have trained a Vision Encoder Decoder model on French medical data: 2,000 images in the training set and 330 for dev and test. I augmented the training set with random noise, blurring, contrast changes, downsizing, upsizing, erosion, dilation, and brightening to get roughly 20,000 training images.
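For readers who want to reproduce this kind of offline augmentation, here is a minimal, dependency-free sketch of three of the listed transforms (noise, brightness, contrast). It is not the pipeline used in this issue: the function names, the parameter ranges, and the choice of 10 augmented copies per image are all assumptions made for illustration, and a real pipeline would normally use an image library rather than nested lists.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image as rows of 0-255 pixel values


def clamp(value: float) -> int:
    """Round to the nearest integer and clip to the valid 0-255 range."""
    return max(0, min(255, int(round(value))))


def add_noise(img: Image, rng: random.Random, sigma: float = 10.0) -> Image:
    """Add independent Gaussian noise to every pixel."""
    return [[clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def adjust_brightness(img: Image, delta: float) -> Image:
    """Shift every pixel by a constant offset."""
    return [[clamp(p + delta) for p in row] for row in img]


def adjust_contrast(img: Image, factor: float) -> Image:
    """Scale each pixel's distance from mid-gray (128) by `factor`."""
    return [[clamp(128 + factor * (p - 128)) for p in row] for row in img]


def augment(img: Image, rng: random.Random) -> Image:
    """Apply the three transforms with randomly drawn strengths (assumed ranges)."""
    out = add_noise(img, rng, sigma=rng.uniform(0.0, 15.0))
    out = adjust_brightness(out, rng.uniform(-30.0, 30.0))
    return adjust_contrast(out, rng.uniform(0.7, 1.3))


def expand_training_set(images: List[Image], copies: int, seed: int = 0) -> List[Image]:
    """Return `copies` augmented variants of each training image.

    Only the training split should pass through this function, so that
    dev and test images remain untouched originals.
    """
    rng = random.Random(seed)
    return [augment(img, rng) for img in images for _ in range(copies)]
```

With 2,000 source images and `copies=10`, `expand_training_set` returns 20,000 images, which matches the order of magnitude described above.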

I used Swin v2 base 256 as the encoder (the images are roughly rectangular, around 400 px high and 1500 px wide) and Dr Bert small as the decoder (but every decoder I tried gave the same result).
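One thing worth checking with these dimensions is what happens to a 400 x 1500 image when it is brought down to the encoder's input size. The sketch below only does the arithmetic, assuming a 256 x 256 square input; whether the preprocessing actually squashes or pads the images is not stated in this issue, and both helper names are hypothetical.

```python
from typing import Tuple


def squash_scales(height: int, width: int, target: int = 256) -> Tuple[float, float]:
    """Per-axis scale factors when resizing straight to target x target.

    Assumes the aspect ratio is NOT preserved (an assumption, not a
    statement about the preprocessing used in this issue).
    """
    return target / height, target / width


def letterbox_size(height: int, width: int, target: int = 256) -> Tuple[int, int]:
    """Resized (height, width) when the aspect ratio IS preserved and the
    longer side is fitted to `target`; the remainder would be padding."""
    scale = target / max(height, width)
    return round(height * scale), round(width * scale)


scale_y, scale_x = squash_scales(400, 1500)
print(f"vertical scale {scale_y:.3f}, horizontal scale {scale_x:.3f}")
print("aspect-preserving size:", letterbox_size(400, 1500))
```

Under these assumptions, a direct resize compresses the width 3.75 times more than the height, while an aspect-preserving resize leaves the text line only about 68 pixels tall inside the 256 x 256 input.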

The model overfits the training data. Here is an example of the trained model's predictions on the training data (CER is around 0):


True labels: ['Demande administrative', 'Renouvellement', 'echographie du coeur, coronaire', 'EMPHYSME', '', 'amoxicilinne, prednisolone et ventoline', 'radios dos', 'Greffe Rénal / Annuloplastie Aortique']
predictions: ['Demande administrative', 'Renouvellement', 'echographie du coeur, coronaire', 'EMPHYSME', '', 'amoxicilinne, prednisolone et ventoline', 'radios dos', 'Greffe Rénal / Annuloplastie Aortique']

True labels: ['TENSiON', 'Dépression nerveuse', 'Prise de sang de contrôle', 'cesarienne', 'Rhumatismes - athrose', 'Maladie Professionnelle tondinopathie 2011 année', 'PARACETAMOL', 'radio controle bilan sanguin']
predictions: ['TENSiON', 'Dépression nerveuse', 'Prise de sang de contrôle', 'cesarienne', 'Rhumatismes - athrose', 'Maladie Professionnelle tondinopathie 2015', 'PARACETAMOL', 'radio controle bilan sanguin']

However, the CER on the dev set stays at 0.66. It is flat from epoch 15 to epoch 30, where I stopped training.
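For reference, CER (character error rate) is commonly computed as the total character-level edit distance divided by the total number of reference characters. The sketch below implements that corpus-level definition in plain Python; it is an assumption that this matches the metric used for the numbers above, since the exact computation is not given here. Aggregating over the whole set also avoids a division by zero for empty labels such as `''`.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `a` into `b` (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Corpus-level CER: total edits / total reference characters."""
    total_edits = sum(levenshtein(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


refs = ["Maladie Professionnelle tondinopathie 2011 année", "PARACETAMOL", ""]
preds = ["Maladie Professionnelle tondinopathie 2015", "PARACETAMOL", ""]
print(f"CER = {cer(refs, preds):.3f}")
```

Running the same function on the training and dev predictions side by side makes the gap between the two splits easy to track per epoch.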

My learning rate is 2e-5.

Do you have any general recommendations for reducing overfitting, apart from getting more images? I do not know how to tackle this issue.

Best regards