google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Fine tuning vs feature extraction methods using BERT #1187

Open yassmine-lam opened 3 years ago

yassmine-lam commented 3 years ago

Hi,

I have read in some blogs that fine-tuning a BERT model works better than extracting features from BERT without fine-tuning and then training a neural network from scratch on those features. The justification given is that fine-tuning a pre-trained model requires less labeled data than training a model from scratch. Has anyone tried comparing these two methods?

Thank you.

joelcarbonera commented 1 year ago

https://aclanthology.org/W19-4302.pdf
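
For anyone unsure what the two setups in the question actually differ in, here is a minimal sketch. It does not use BERT or TensorFlow: a tiny linear "encoder" stands in for the pre-trained model, and all names (`PRETRAINED`, `encode`, `train`, `tune_encoder`) and the data are hypothetical, made up for illustration. The only point is which parameters receive gradient updates in each setup.

```python
# Toy illustration only: a tiny linear "encoder" stands in for BERT.
# Inputs are 2-d vectors, the target is y = x1 + 2 * x2 (made-up data).
DATA = [((1.0, 0.0), 1.0), ((0.0, 1.0), 2.0), ((1.0, 1.0), 3.0), ((2.0, 1.0), 4.0)]
PRETRAINED = [1.0, 0.5]  # hypothetical "pre-trained" encoder weights


def encode(encoder, x):
    """Map an input vector to a single feature (stand-in for the encoder output)."""
    return sum(w * xi for w, xi in zip(encoder, x))


def train(data, pretrained, tune_encoder, lr=0.01, epochs=500):
    """Train a one-weight head on top of the encoder with plain SGD.

    tune_encoder=False -> feature extraction: encoder frozen, only the head learns.
    tune_encoder=True  -> fine-tuning: encoder and head are both updated.
    """
    encoder = list(pretrained)  # copy, so the pre-trained weights are not mutated
    head = 0.0                  # the new task-specific layer starts from scratch
    for _ in range(epochs):
        for x, y in data:
            feature = encode(encoder, x)
            err = head * feature - y
            # Gradients of the squared error (err ** 2) w.r.t. head and encoder.
            grad_head = 2.0 * err * feature
            grad_enc = [2.0 * err * head * xi for xi in x]
            head -= lr * grad_head
            if tune_encoder:
                encoder = [w - lr * g for w, g in zip(encoder, grad_enc)]
    return encoder, head


def mse(data, encoder, head):
    """Mean squared error of head * encode(x) over the data set."""
    return sum((head * encode(encoder, x) - y) ** 2 for x, y in data) / len(data)


frozen_enc, frozen_head = train(DATA, PRETRAINED, tune_encoder=False)
tuned_enc, tuned_head = train(DATA, PRETRAINED, tune_encoder=True)

print("feature extraction:", frozen_enc, mse(DATA, frozen_enc, frozen_head))
print("fine-tuning:       ", tuned_enc, mse(DATA, tuned_enc, tuned_head))
```

In the feature-extraction run the encoder weights come back identical to `PRETRAINED`, while in the fine-tuning run they move away from it. A toy like this says nothing about which approach works better for a real BERT task; that depends on the task and the amount of labeled data, which is what the comparison asked about would have to measure.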