stefan-it / turkish-bert

Turkish BERT/DistilBERT, ELECTRA and ConvBERT models

Where is the model? #1

Closed gevezex closed 4 years ago

gevezex commented 4 years ago

So where can we download the trained Turkish model?

stefan-it commented 4 years ago

Hi @gevezex,

training of the models (both cased and uncased) is currently in progress.

The cased model is now at 1.4M steps, uncased model at 0.5M steps.

I updated the README with links to the TensorBoards.

After evaluation (I plan to evaluate the models on PoS and NER datasets from here), I'll upload the models to the Hugging Face model hub.

I think training and evaluation of the cased model will be finished by Friday, so stay tuned!

gevezex commented 4 years ago

Great! Multilingual BERT performed very badly on that dataset, so I am very curious about your trained model.

TharinduDR commented 4 years ago

Hi @stefan-it,

Thank you for the repo. Did you finish the training?

stefan-it commented 4 years ago

Hey guys,

the model has now been uploaded to the Hugging Face model hub:

https://huggingface.co/dbmdz/bert-base-turkish-cased

You can directly use it in the Transformers library :)
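
For reference, a minimal loading sketch using the Transformers Auto classes (a hypothetical usage example, assuming a recent Transformers release with PyTorch installed; the sample sentence is just an illustration):

from transformers import AutoModel, AutoTokenizer

# Load the cased Turkish BERT model and its tokenizer from the model hub
tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-turkish-cased")
model = AutoModel.from_pretrained("dbmdz/bert-base-turkish-cased")

# Encode a sample Turkish sentence and run it through the model
inputs = tokenizer("Merhaba dünya!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # [batch_size, sequence_length, hidden_size]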

The uncased model is coming next week (training has finished; only the evaluation is missing).

serdar-eric commented 4 years ago

Hi @stefan-it. Any updates about the uncased model?

stefan-it commented 4 years ago

Hi @serdar-eric,

uncased model will be released this week. Additionally, I also trained cased and uncased models with a larger vocab size (128k instead of 32k). I'm currently doing the evaluation, but I think they will be ready by Sunday :)

serdar-eric commented 4 years ago

It's great to hear that, @stefan-it. Thank you very much!

stefan-it commented 4 years ago

Hi @serdar-eric,

the uncased model is now available on the model hub: https://huggingface.co/dbmdz/bert-base-turkish-uncased.

Additionally, you can now find the models with a larger vocab size (128k) on the model hub.

serdar-eric commented 4 years ago

Thank you very much @stefan-it . Already started to download and work on it :)

cbalkig commented 4 years ago

Stefan, I need the 128k TF model to compare it with the 32k one. Could you share it? Thanks a lot! :)

cbalkig commented 4 years ago

@stefan-it Stefan, just a reminder. I have little time left to submit the thesis. BTW, your GitHub repo is linked in my paper's references... :) Thanks a lot.

stefan-it commented 4 years ago

@balki7 no problem, I'm currently preparing the archives :)

stefan-it commented 4 years ago

Download links are now available:

# sha256: 85977347ef031c5cf5f41bd79342e1548b119a1c5b1e7c5dcc1cae4855c4d2b1
wget https://schweter.eu/cloud/bert-base-turkish-128k-cased/bert-base-turkish-128k-cased-tf.tar.gz

# sha256: b73916ee54a25b1a75df87f55dea4adbc8b61b6f82ef0983e556b1007e3be3df
wget https://schweter.eu/cloud/bert-base-turkish-128k-uncased/bert-base-turkish-128k-uncased-tf.tar.gz

Please let me know if the checkpoints are working + good luck with your thesis :)
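
For completeness, a small sketch for verifying the downloaded archives against the sha256 sums above (plain Python with hashlib; the file names are assumed to match the archives fetched by the wget commands):

import hashlib

# Expected sha256 sums from the download instructions above
EXPECTED = {
    "bert-base-turkish-128k-cased-tf.tar.gz": "85977347ef031c5cf5f41bd79342e1548b119a1c5b1e7c5dcc1cae4855c4d2b1",
    "bert-base-turkish-128k-uncased-tf.tar.gz": "b73916ee54a25b1a75df87f55dea4adbc8b61b6f82ef0983e556b1007e3be3df",
}

for filename, expected in EXPECTED.items():
    digest = hashlib.sha256()
    with open(filename, "rb") as archive:
        # Hash in 1 MiB chunks so the large archives never need to fit in memory
        for chunk in iter(lambda: archive.read(1 << 20), b""):
            digest.update(chunk)
    status = "OK" if digest.hexdigest() == expected else "MISMATCH"
    print(f"{filename}: {status}")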

cbalkig commented 4 years ago

Thank you very much, Stefan... :)