facebookresearch / XLM

PyTorch original implementation of Cross-lingual Language Model Pretraining.

Fine-tuning XLM for NER: loss decreases but F1 score remains unchanged #132

Open stefensa opened 5 years ago

stefensa commented 5 years ago

Hi, I want to fine-tune the XNLI-15 pretrained model for a downstream NER task, so I added a BiLSTM and CRF architecture on top of it.

But something weird happened: the loss of my model keeps decreasing, while the F1 score stays exactly where it was at epoch 0, iteration 1. I don't know the reason. Has anyone run into this before? I'd appreciate any help. This is the initial state of my model's performance:

[Screenshot: WechatIMG37]

And this is the performance after 9 epochs and 1840 iterations (batch_size is 32):

[Screenshot: WechatIMG36]
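For anyone reading along, here is a minimal sketch of how entity-level F1 is typically computed from BIO tags. This assumes the usual exact-span-match definition; the evaluation code in my repository may differ, and the helper names `extract_spans` and `entity_f1` are made up for this illustration. It shows that F1 depends only on the decoded tag sequences, so the loss can keep falling without moving F1 as long as the decoded tags stay the same.

```python
from typing import List, Set, Tuple

# (sentence index, start, end exclusive, entity type)
Span = Tuple[int, int, int, str]


def extract_spans(tags: List[str], sent_idx: int = 0) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence.

    Assumption for this sketch: a span must start with a B- tag;
    stray I- tags without a matching B- are ignored.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((sent_idx, start, i, label))
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exactly matching entity spans."""
    gold_spans: Set[Span] = set()
    pred_spans: Set[Span] = set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= extract_spans(g, idx)
        pred_spans |= extract_spans(p, idx)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
all_o = [["O", "O", "O", "O"]]              # model predicts no entities at all
partial = [["B-PER", "I-PER", "O", "O"]]    # model finds one of two entities

f1_all_o = entity_f1(gold, all_o)
f1_partial = entity_f1(gold, partial)
print(f1_all_o, f1_partial)
```

If a model decodes the same tags at every evaluation (for example all `O`), this metric cannot change no matter what the training loss does, so comparing the decoded sequences across epochs may be a useful first check.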

My code has been uploaded to GitHub at https://github.com/stefensa/XLM_NER and you can just run python (your_file_dir)/fine_tuning/ner.py to begin training. The pretrained model I used is the 15-language MLM + TLM model from https://dl.fbaipublicfiles.com/XLM/mlm_tlm_xnli15_1024.pth which should be put in (your_file_dir)/model

PS: The CRF code comes from https://pytorch-crf.readthedocs.io/en/stable/#api-documentation so it needs to be installed with pip install pytorch-crf

stefensa commented 5 years ago

I just spent a whole day trying to figure out the cause of this bug, but I failed. Thanks for any help.