sinovation / ZEN

A BERT-based Chinese Text Encoder Enhanced by N-gram Representations

Is it really necessary to fine-tune for 30 epochs? #6

Closed vikotse closed 4 years ago

vikotse commented 4 years ago

I trained for 30 epochs on a classification dataset, following examples/README.md and run_sequence_level_classification.py, but got poor performance (test acc < 50), which falls well short of expectations (test acc > 80 with BERT).

Or should I fine-tune for just 3 epochs, as is done when fine-tuning BERT?

GuiminChen commented 4 years ago

The example only shows general usage for reference; you should adjust the hyper-parameters for your own task and dataset. In our experience, 3 epochs are enough to get a good result, so it seems something may have gone wrong with your fine-tuning. You are welcome to share more details with us.
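
For reference, a 3-epoch fine-tuning run might look like the sketch below. This is a hypothetical invocation: the flag names assume the example script follows the standard BERT fine-tuning interface, and the task name and paths are placeholders, so check `python run_sequence_level_classification.py --help` for the exact arguments.

```bash
# Hypothetical invocation; flag names assume the script mirrors the standard
# BERT fine-tuning interface. Verify the exact arguments with --help.
python run_sequence_level_classification.py \
  --task_name <your_task> \
  --do_train \
  --do_eval \
  --data_dir /path/to/your/dataset \
  --bert_model /path/to/ZEN_pretrained_model \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /path/to/output
```

If 3 epochs still yield test acc < 50, it is worth double-checking the data format and the learning rate before increasing the epoch count.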