stefan-it / turkish-bert

Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

Commands #8

Closed: cbalkig closed this issue 4 years ago

cbalkig commented 4 years ago

Stefan,

Could you share the commands you used to train the model? I'd like to add some domain-specific data and retrain, and for that I need your commands.

Thanks a lot.

stefan-it commented 4 years ago

Hi @balki7,

thanks for your interest 🤗

I documented the training commands in this cheatsheet.

But if you just want to fine-tune the language model on your own data, I think you can also use the Transformers library. More precisely, you could follow the steps documented here:

https://github.com/huggingface/transformers/tree/master/examples#language-model-training

:)
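For readers following along, continued masked-language-model training with Transformers might look roughly like the sketch below. It is not the training setup used for this repository: the Hub id `dbmdz/bert-base-turkish-cased`, the helper names `load_domain_lines` and `continue_pretraining`, the one-example-per-line text file, and all hyperparameters are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumption: Hub id of the cased Turkish BERT model; swap in the checkpoint you use.
MODEL_NAME = "dbmdz/bert-base-turkish-cased"


def load_domain_lines(path):
    """Return the non-empty, stripped lines of a UTF-8 text file (one example per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def continue_pretraining(train_file, output_dir, epochs=1):
    """Continue MLM training of MODEL_NAME on domain text (hypothetical helper)."""
    # Imported lazily so load_domain_lines works without transformers installed.
    from transformers import (
        AutoModelForMaskedLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

    # Tokenize each line separately; padding is left to the collator.
    lines = load_domain_lines(train_file)
    encodings = tokenizer(lines, truncation=True, max_length=128)
    dataset = [{"input_ids": ids} for ids in encodings["input_ids"]]

    # The collator pads each batch and randomly masks 15% of the tokens.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=epochs,
        per_device_train_batch_size=8,  # illustrative value, tune for your GPU
    )
    trainer = Trainer(
        model=model, args=args, train_dataset=dataset, data_collator=collator
    )
    trainer.train()

    # Save model and tokenizer together so the directory can be reloaded later.
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)


if __name__ == "__main__":
    # Hypothetical paths: replace with your own corpus and output directory.
    continue_pretraining("domain_corpus.txt", "turkish-bert-domain")
```

This keeps the original vocabulary, so domain terms missing from it are split into subword pieces. For a large corpus, a streaming dataset would be a better fit than loading every line into memory.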

cbalkig commented 4 years ago

Thanks again, Stefan. My MS thesis is about BERT for Turkish. If that's OK with you, I'll cite your project as well. :)

stefan-it commented 4 years ago

Hi @balki7,

sure, that would be awesome :)

Some upcoming news: an uncased BERT model is coming soon. I've also trained cased and uncased models with a larger vocab size (128k instead of 32k); only the evaluation is still missing.