NormXU / ERNIE-Layout-Pytorch

An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP.
http://arxiv.org/abs/2210.06155
MIT License

Outputs are driven to zero when there's a strong imbalance #24

Open DiddyC opened 1 week ago

DiddyC commented 1 week ago

Hi,

I recently upgraded to PyTorch 2.x, using the latest code from the repository. While training a Named Entity Recognition (NER) classification model, I've noticed that when the majority of the tokens belong to a single class (e.g., class 0), the model converges and predicts only that majority class. This happens regardless of the batch size or learning rate I select.

Interestingly, this issue did not occur when using PyTorch 1.8. Has anyone else encountered this problem? Any insights or solutions would be greatly appreciated!

Thanks in advance for your help!

NormXU commented 1 week ago

@DiddyC That sounds a bit unusual. Could you provide a sample test so I can run the forward pass step by step?
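
Before a step-by-step forward pass is available, it may help to quantify the imbalance in the training labels. The sketch below is only an illustration and is not code from this repository: `label_stats` and the toy `labels` are hypothetical names, and it assumes padded or special tokens are labelled `-100`, which is the default `ignore_index` of `torch.nn.CrossEntropyLoss`. It reports per-class frequencies and "balanced" inverse-frequency weights.

```python
from collections import Counter

IGNORE_INDEX = -100  # assumed padding label; default ignore_index in torch.nn.CrossEntropyLoss


def label_stats(label_seqs, num_classes, ignore_index=IGNORE_INDEX):
    """Return (frequencies, weights) for token-level labels.

    frequencies[c] is the share of non-ignored tokens with class c.
    weights[c] is total / (num_classes * count_c), or 0.0 if class c never occurs.
    """
    counts = Counter(
        label for seq in label_seqs for label in seq if label != ignore_index
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labelled tokens found")

    freqs = [counts[c] / total for c in range(num_classes)]
    weights = [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]
    return freqs, weights


# Toy batch (hypothetical data): class 0 dominates, -100 marks padding.
labels = [
    [0, 0, 0, 0, 1, -100],
    [0, 0, 0, 2, -100, -100],
]
freqs, weights = label_stats(labels, num_classes=3)
majority_baseline = max(freqs)  # accuracy of always predicting the majority class
print(freqs, weights, majority_baseline)
```

If the majority-class baseline is very close to 1, a model that predicts only class 0 already achieves a low loss, so collapse is plausible under any framework version. The weights could then be tried as the `weight` argument of `torch.nn.CrossEntropyLoss` (as a float tensor). This does not explain why PyTorch 1.8 and 2.x would behave differently; it only helps rule the data in or out.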