derlem / kanarya

A deep learning model for classification of 'de/da' clitics in Turkish

Flair Training with BERT Multilingual #19

Closed haozturk closed 4 years ago

haozturk commented 4 years ago

Apart from training BERT with custom data, we also use pretrained multilingual BERT models to create BERT embeddings. Our aim is to see how much BERT Multilingual contributes to our model and to compare it against the BERT embeddings trained on our custom data.
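For context, a minimal sketch of how pretrained multilingual BERT embeddings can be stacked with classical word embeddings in Flair; the model names and embedding choices below are assumptions for illustration, not the project's actual configuration:

```python
def build_embeddings():
    # Sketch only: stacks pretrained multilingual BERT embeddings with
    # classical word embeddings in Flair. Model names are assumptions.
    from flair.embeddings import (
        TransformerWordEmbeddings,  # called BertEmbeddings in older Flair releases
        WordEmbeddings,
        StackedEmbeddings,
    )

    return StackedEmbeddings([
        TransformerWordEmbeddings("bert-base-multilingual-cased"),
        WordEmbeddings("glove"),  # pretrained GloVe vectors shipped with Flair
        WordEmbeddings("tr"),     # Turkish fastText vectors shipped with Flair
    ])
```

Custom Word2Vec vectors could likewise be wrapped as a `WordEmbeddings` instance and appended to the stack.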

haozturk commented 4 years ago

Currently we have a model that was trained with BERT Multilingual, GloVe, FastText, and Word2Vec embeddings for nearly 90 epochs. @alperen-degirmenci-2017400255 can give more detailed results. We have paused this training to run a Flair training with custom BERT embeddings, so I'm changing the status to blocked for now.

alperen-degirmenci-2017400255 commented 4 years ago

This model was trained with the model trainer of the Flair NLP framework. The following parameters were used in training:

- Batch size: 16
- Maximum epochs: 100
- Learning rate: 0.1
- Hidden size: 128
- RNN layers: 2
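Those hyperparameters map onto Flair's `ModelTrainer` roughly as sketched below; the tag type name, output path, and the `corpus`/`embeddings` arguments are placeholders, not the project's actual setup:

```python
def train_de_da_tagger(corpus, embeddings):
    # Hyperparameters as reported above; corpus and embeddings are
    # placeholders supplied by the caller.
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    tagger = SequenceTagger(
        hidden_size=128,   # Hidden size: 128
        rnn_layers=2,      # RNN layers: 2
        embeddings=embeddings,
        tag_dictionary=corpus.make_tag_dictionary(tag_type="de_da"),
        tag_type="de_da",  # assumed tag type name
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.train(
        "resources/taggers/de-da",  # placeholder output path
        learning_rate=0.1,          # Learning rate: 0.1
        mini_batch_size=16,         # Batch size: 16
        max_epochs=100,             # Maximum epochs: 100
    )
```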

Training results at the 65th epoch were as follows: precision: 0.8786, recall: 0.8056, accuracy: 0.7250, F1-score: 0.8405, loss: 0.6661.

The model at this checkpoint correctly finds the errors in zor-cumleler.txt with 72/100 accuracy.

Training results at the 90th epoch were as follows: precision: 0.9081, recall: 0.7900, accuracy: 0.7315, F1-score: 0.8449, loss: 0.5906.

The model at this checkpoint correctly finds the errors in zor-cumleler.txt with 74/100 accuracy.
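As a sanity check, the reported F1-scores at both checkpoints are consistent with the harmonic mean of precision and recall, F1 = 2PR / (P + R):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported metrics at the 65th and 90th epochs (from the results above).
print(round(f1_score(0.8786, 0.8056), 4))  # 0.8405
print(round(f1_score(0.9081, 0.7900), 4))  # 0.8449
```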