microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

Layoutlmv3 LayoutLMv3ForSequenceClassification NaN loss #1489

Open Rithsek99 opened 7 months ago

Rithsek99 commented 7 months ago

Hello,

I'm training LayoutLMv3 (LayoutLMv3ForSequenceClassification) and the loss goes to NaN on the very first iteration. I've tried increasing the batch size and gradient accumulation and lowering the learning rate, but the loss still goes to NaN. I'm training with PyTorch Lightning.

My source code is essentially this tutorial: https://www.mlexpert.io/machine-learning/tutorials/document-classification-with-layoutlmv3 (layoutlmv3-large), with minor tweaks.
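Since the NaN shows up on the very first iteration, one thing worth ruling out is bad input data rather than the optimizer settings. Below is a minimal sketch of a batch sanity check; it is not taken from the tutorial or from this repo. It assumes the batch is a dict with the keys `pixel_values`, `bbox`, and `labels`, already converted to nested Python lists (e.g. via `.tolist()`), and that bounding boxes are normalized to the 0-1000 range LayoutLMv3 expects. The helper names `flatten` and `check_batch` are hypothetical.

```python
import math


def flatten(values):
    """Yield every scalar from an arbitrarily nested list or tuple."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        else:
            yield v


def check_batch(batch, num_labels):
    """Return a list of human-readable problems found in one batch.

    Assumption: `batch` is a dict of nested Python lists; the key names
    ("pixel_values", "bbox", "labels") are illustrative, not confirmed
    against the training code in question.
    """
    problems = []

    # Any NaN or inf in the image tensor will propagate straight into the loss.
    if not all(math.isfinite(x) for x in flatten(batch.get("pixel_values", []))):
        problems.append("pixel_values contains NaN or inf")

    # LayoutLMv3 expects box coordinates normalized to the range 0..1000.
    if not all(0 <= c <= 1000 for c in flatten(batch.get("bbox", []))):
        problems.append("bbox has coordinates outside 0..1000")

    # Class indices must be valid for the classification head.
    if not all(0 <= y < num_labels for y in flatten(batch.get("labels", []))):
        problems.append("labels outside 0..num_labels-1")

    return problems


if __name__ == "__main__":
    example = {
        "pixel_values": [[0.1, float("nan")]],
        "bbox": [[[0, 0, 1200, 10]]],
        "labels": [5],
    }
    for problem in check_batch(example, num_labels=3):
        print(problem)
```

Running this on the first batch from the dataloader before calling the model would at least tell whether the inputs are clean. If they are, mixed-precision settings would be the next thing to look at.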

@HYPJUDY

Thanks