writer / fitbert

Use BERT to Fill in the Blanks
https://pypi.org/project/fitbert/
Apache License 2.0

Fine-tune FitBERT with my own dataset #21

Closed joshhu closed 2 years ago

joshhu commented 3 years ago

Hi,

I want to build a model that predicts a masked word from a list of candidates. How can I train it on my own dataset? For example, my training data looks like:

Sentence --> This river runs ***mask*** the plain. Correct --> through
Sentence --> We have a brilliant ***mask***. Correct --> thought

I just want the model to pick the right word from the candidate list ['through', 'thoughts'] for a test sentence like "You don't know how much I've been ***mask***."

How can I fine-tune this kind of model based on FitBert?

Thank you.

sam-writer commented 2 years ago

You would fine-tune a BERT model in the standard way, then use that model with FitBert. Predicting the masked word is BERT's main pretraining task. I would do the fine-tuning with transformers, as in this notebook: https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/language_modeling.ipynb#scrollTo=a3KD3WXU3l-O
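A minimal sketch of that workflow, assuming recent `transformers` and `fitbert` APIs: turn the (masked sentence, correct word) pairs into full sentences, fine-tune with standard random masking via `DataCollatorForLanguageModeling`, then hand the fine-tuned model to FitBert for candidate ranking. The toy data, output directory, and hyperparameters below are made up for illustration.

```python
def fill_training_sentences(pairs, mask_token="***mask***"):
    """Turn (masked sentence, correct word) pairs into complete sentences.

    Standard masked-LM fine-tuning does not use your fixed mask positions;
    the data collator re-masks random tokens on the fly, so we train on the
    full, unmasked sentences.
    """
    return [sentence.replace(mask_token, word) for sentence, word in pairs]


# Toy data in the format from the question (assumption: small in-memory list).
pairs = [
    ("This river runs ***mask*** the plain.", "through"),
    ("We have a brilliant ***mask***.", "thought"),
]
train_sentences = fill_training_sentences(pairs)

if __name__ == "__main__":
    # Heavy part: downloads a pretrained model, so it is guarded here.
    from transformers import (
        BertForMaskedLM,
        BertTokenizerFast,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    # Tokenize; the collator applies 15% random masking per batch.
    encodings = tokenizer(train_sentences, truncation=True, padding=True)
    dataset = [
        {"input_ids": ids, "attention_mask": mask}
        for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])
    ]
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm_probability=0.15
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="mlm-finetuned", num_train_epochs=3),
        train_dataset=dataset,
        data_collator=collator,
    )
    trainer.train()

    # Plug the fine-tuned model into FitBert and rank the candidates.
    from fitbert import FitBert

    fb = FitBert(model=model, tokenizer=tokenizer)
    ranked = fb.rank(
        "You don't know how much I've been ***mask***.",
        options=["through", "thoughts"],
    )
    print(ranked)  # best candidate first
```

Note that fine-tuning on full sentences with random masking is the standard recipe; if you specifically want the model to see your chosen mask positions, you would need a custom collator instead.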