Closed skmalviya closed 4 years ago
I have gone through training the Google BERT model from scratch. I found that BERT requires consecutive, related sentences as input, e.g.
sentence 1 sentence 2
because it uses next-sentence prediction as a core training objective. But in the AI4Bharat-IndicNLP Hindi corpus, consecutive sentences are mostly unrelated.
If the corpus is not meant for BERT training, what would you suggest as a starting point for training a good Hindi BERT model?
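To make the concern concrete, here is a minimal sketch of how next-sentence-prediction pairs are typically built from a corpus. It is a simplified illustration, not the actual Google BERT preprocessing script: the function name `make_nsp_pairs`, the 50% negative rate as coded here, and the toy documents are all assumptions for this example. The point it shows is that a positive pair is only meaningful when sentence `i + 1` really follows sentence `i` in the source document.

```python
import random


def make_nsp_pairs(documents, seed=0):
    """Build (sentence_a, sentence_b, is_next) examples.

    documents: list of documents, each a list of sentences in original order.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    examples = []
    for doc_idx, doc in enumerate(documents):
        # Other non-empty documents that a negative sentence_b can be drawn from.
        others = [j for j, d in enumerate(documents) if j != doc_idx and d]
        for i in range(len(doc) - 1):
            sent_a = doc[i]
            if others and rng.random() < 0.5:
                # Negative example: sentence_b is a random sentence
                # from a different document.
                sent_b = rng.choice(documents[rng.choice(others)])
                examples.append((sent_a, sent_b, False))
            else:
                # Positive example: sentence_b is the sentence that
                # actually follows sentence_a in the same document.
                examples.append((sent_a, doc[i + 1], True))
    return examples


if __name__ == "__main__":
    toy_docs = [
        ["a1", "a2", "a3"],  # sentences of one document, in order
        ["b1", "b2"],        # sentences of another document
    ]
    for example in make_nsp_pairs(toy_docs):
        print(example)
```

If the sentences in a corpus are shuffled or otherwise unrelated, the "positive" branch above pairs sentences that do not actually follow each other, so the labels carry little signal.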
We have recently released a BERT model for Indian languages - https://indicnlp.ai4bharat.org/