clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.52k stars 443 forks

Idea: Freezing SwinEncoder and fine-tuning BARTDecoder only on custom data #275

Open jackkwok opened 7 months ago

jackkwok commented 7 months ago

My goal is to fine-tune on my consumer-grade NVIDIA RTX GPU, which has only 8GB of memory.

The Donut architecture has a SwinEncoder followed by a BARTDecoder.

I plan to freeze all the layers in SwinEncoder by setting requires_grad to False and fine-tune only the BARTDecoder layers.
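
A minimal sketch of that plan, assuming the model exposes its two halves as `encoder` and `decoder` attributes. `ToyEncoderDecoder`, `freeze_encoder`, and `count_parameters` are hypothetical names used for illustration only; the toy layers stand in for the real SwinEncoder and BARTDecoder so the snippet runs without downloading any weights.

```python
import torch
from torch import nn


class ToyEncoderDecoder(nn.Module):
    """Hypothetical stand-in for a model with `encoder` and `decoder` submodules."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
        self.decoder = nn.Linear(32, 8)

    def forward(self, x):
        return self.decoder(self.encoder(x))


def freeze_encoder(model):
    """Turn off gradients for all encoder parameters; return the still-trainable ones."""
    for param in model.encoder.parameters():
        param.requires_grad = False
    return [p for p in model.parameters() if p.requires_grad]


def count_parameters(params):
    """Total number of scalar values across the given parameters."""
    return sum(p.numel() for p in params)


model = ToyEncoderDecoder()
trainable = freeze_encoder(model)

# Pass only the trainable (decoder) parameters to the optimizer, so no
# optimizer state is allocated for the frozen encoder.
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

# One dummy training step on random data to show that only the decoder gets gradients.
x = torch.randn(4, 16)
target = torch.randn(4, 8)
loss = nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()

frozen_count = count_parameters(model.encoder.parameters())
trainable_count = count_parameters(trainable)
print(f"frozen: {frozen_count}, trainable: {trainable_count}")
```

With the encoder frozen, no gradients or optimizer state are kept for its weights, which is where the memory saving would come from. Whether that is enough to fit in 8GB, and how it affects accuracy, would still need to be measured on the real model.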

Has anyone tried this approach already? Was it successful for your case?

KartavyaBagga commented 7 months ago

I can run this, but won't the model forget the pretrained configuration from the base model after fine-tuning?

rodrigomeireles commented 6 months ago

Hey @jackkwok how did that turn out? Any experiment results you can share?

praneetreddy017 commented 5 months ago

+1, any results you can share?

balajiChundi commented 5 months ago

Hey @jackkwok, please share the experiment outcome!!