kamalkraj / ALBERT-TF2.0

ALBERT model Pretraining and Fine Tuning using TF2.0

Freezing layers during training #26

Open birdmw opened 4 years ago

birdmw commented 4 years ago

When training I see progress followed by degradation. This is (likely) because the model is overfitting due to the limited corpus size of 8k samples: during fine-tuning we are overwriting the pre-trained weights. What we would like to do is freeze the original (pre-trained) layers. We need to figure out how to do this.
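
A minimal sketch of one way to do this in TF2/Keras, assuming the classifier exposes the pretrained ALBERT encoder as a named sub-layer (the `"albert"` name match and the `build_classifier` call below are hypothetical placeholders, not this repo's actual API):

```python
import tensorflow as tf

def freeze_pretrained_layers(model: tf.keras.Model) -> None:
    """Mark pretrained encoder layers as non-trainable so that only the
    task-specific head is updated during fine-tuning."""
    for layer in model.layers:
        # Heuristic: freeze any sub-layer whose name suggests it is the
        # pretrained ALBERT encoder (adjust the match to the real layer name).
        if "albert" in layer.name.lower():
            layer.trainable = False

# Usage sketch: changes to `trainable` only take effect after (re)compiling.
# model = build_classifier(...)            # however the repo builds the model
# freeze_pretrained_layers(model)
# model.compile(optimizer=tf.keras.optimizers.Adam(2e-5),
#               loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
```

A common variant is gradual unfreezing: train the head first with the encoder frozen, then set `trainable = True` again (and recompile, typically with a lower learning rate) for the last few epochs.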