Open yassmine-lam opened 3 years ago
Thank you for your interest in my code.
I don't fine-tune the BERT part of the network; instead, I throw away the gradient information of the BERT embeddings, so BERT doesn't learn from the datasets.
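The freezing described above can be sketched roughly like this. This is a minimal PyTorch sketch, not the repository's actual code: a small `nn.Embedding` stands in for the BERT part, and all module names here are illustrative. With Hugging Face transformers, the same idea would be `for p in bert.parameters(): p.requires_grad = False`.

```python
# Sketch of "throwing away gradient information" for the embedding part.
# An nn.Embedding stands in for BERT; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

embedding = nn.Embedding(100, 16)   # stand-in for the BERT embeddings
head = nn.Linear(16, 2)             # task-specific layers that do learn

# Freeze the embedding part: no gradients are computed or applied to it.
for p in embedding.parameters():
    p.requires_grad = False

frozen_before = embedding.weight.clone()
head_before = head.weight.clone()

# Only the trainable head parameters go to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

tokens = torch.tensor([[1, 2, 3]])
labels = torch.tensor([0])

loss = F.cross_entropy(head(embedding(tokens).mean(dim=1)), labels)
loss.backward()
optimizer.step()

# The frozen embeddings are unchanged; only the head was updated.
assert torch.equal(embedding.weight, frozen_before)
assert not torch.equal(head.weight, head_before)
```

So the BERT part acts as a fixed feature extractor while the layers on top learn the task.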
Thank you.
hi,
thank you for the quick answer.
Do you think fine-tuning BERT on the datasets would increase the performance of your code?
I ask because I've read many blogs saying that fine-tuning BERT works better than extracting features without fine-tuning, and that fine-tuning requires less labeled data than a model built from scratch.
What do you think?
Thank you
Sorry, I don't understand "they said that fine-tuning requires less labeled data than a model built from scratch". I think fine-tuning BERT has two patterns:
1. Use BERT alone. This is pre-training: BERT learns a representation of the dataset, and after that you train the model with the pre-trained BERT.
2. Use BERT together with other layers, like in my code (my code turns fine-tuning off). This is not pre-training: BERT is part of the model, and BERT and the other layers learn the dataset from gradient information at the same time. But if the model has too many layers, the BERT parameters get broken (broken means BERT can't learn).
If I turned fine-tuning on, I would select 1, and I think that would increase the performance.
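For reference, one common way to turn fine-tuning on without "breaking" the pretrained parameters is to give them a much smaller learning rate than the freshly initialized layers. This is a hedged PyTorch sketch with stand-in modules, not the code from this repository:

```python
# Sketch of fine-tuning with per-group learning rates: the pretrained
# part gets gentle updates, the new task layers learn faster. A small
# nn.Linear stands in for the BERT encoder; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

bert_like = nn.Linear(16, 16)   # stand-in for the pretrained BERT part
head = nn.Linear(16, 2)         # freshly initialized task layers

w_before = bert_like.weight.clone()

optimizer = torch.optim.Adam([
    {"params": bert_like.parameters(), "lr": 2e-5},  # small: protect pretraining
    {"params": head.parameters(), "lr": 1e-3},       # larger: learn the task
])

x = torch.randn(4, 16)
y = torch.randint(0, 2, (4,))

loss = F.cross_entropy(head(bert_like(x)), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# The pretrained part moved, but only slightly (fine-tuning is on).
assert not torch.equal(bert_like.weight, w_before)
```

The small learning rate on the pretrained group is one standard guard against destroying the pretrained representation when many new layers sit on top.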
I am a beginner, so my thinking may be wrong. Also, I'm not good at English, sorry.
Thank you.
ok, I understand, and thank you very much for taking the time to answer.
btw: I am a beginner and my English is not that good either :) I don't think this should be a problem; what matters most is that we are trying to learn and share ideas with others :)
About the sentence you did not understand, here is a blog post https://pysnacks.com/machine-learning/bert-text-classification-with-fine-tuning/
in which different methods of using BERT are discussed. The author recommends fine-tuning and gives some reasons, including that it needs less labeled data to learn the parameters compared to a model built from scratch. I am trying to learn NLP with deep learning and I have a small dataset, which is why I am looking for the best way to overcome this problem and achieve good results.
Thanks again
Hi,
Thanks for sharing your code.
Are the BERT embeddings you used fine-tuned or not?
Thank you