Open jackkwok opened 7 months ago
I can run this, but won't it forget the pretrained configuration from the base model after fine-tuning?
Hey @jackkwok how did that turn out? Any experiment results you can share?
+1, any results you can share?
Hey @jackkwok, please share the experiment outcome!!
My goal is to be able to fine-tune on my consumer-grade NVIDIA RTX GPU, which has only 8 GB of memory.
The Donut architecture has a SwinEncoder followed by a BARTDecoder.
I plan to freeze all the layers in the SwinEncoder by setting `requires_grad` to False and fine-tune only the BARTDecoder layers. Has anyone tried this approach already? Was it successful in your case?
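For reference, here is a minimal sketch of the freezing step. It assumes the model exposes `encoder` (the Swin encoder) and `decoder` (the BART decoder) attributes, as the `DonutModel` class in the clovaai/donut codebase does; the checkpoint name and learning rate are illustrative, not recommendations.

```python
import torch
from donut import DonutModel  # from the clovaai/donut package

# Load the pretrained base model (checkpoint name is illustrative).
model = DonutModel.from_pretrained("naver-clova-ix/donut-base")

# Freeze every parameter in the Swin encoder so only the BART decoder is trained.
for param in model.encoder.parameters():
    param.requires_grad = False

# Sanity check: compare trainable vs. total parameter counts.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable params: {trainable:,} / {total:,}")

# Build the optimizer over only the parameters that still require gradients.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad),
    lr=3e-5,
)
```

Note that freezing only reduces optimizer-state and gradient memory for the encoder; activations are still computed during the forward pass, so the savings may not be enough on their own for an 8 GB card.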