nlpaueb / greek-bert

A Greek edition of the BERT pre-trained language model
MIT License
141 stars, 10 forks

Deaccent - Lower #3

Closed soutsios closed 2 years ago

soutsios commented 3 years ago

Is there really a need to pre-process text (deaccent and lower-case) as described in https://github.com/nlpaueb/greek-bert#pre-process-text-deaccent---lower when the BERT tokenizer already does this itself (https://github.com/google-research/bert#tokenization)?

iliaschalkidis commented 2 years ago

No, there isn't. The tokenizer does it automatically. Sorry for the belated response...
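
For anyone who wants to see what that normalization amounts to, here is a minimal standard-library sketch. It is an illustration and not necessarily the exact code from the README: the helper name `strip_accents_and_lowercase` and the sample sentence are assumptions. It decomposes the text with Unicode NFD, drops the combining marks (category `Mn`), and lower-cases the result, which is roughly what an uncased BERT tokenizer does internally when lower-casing is enabled.

```python
import unicodedata


def strip_accents_and_lowercase(text: str) -> str:
    """Remove accents (combining marks) and lower-case the text.

    Hypothetical helper for illustration only.
    """
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the Greek tonos and dialytika.
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return without_marks.lower()


if __name__ == "__main__":
    sample = "Αυτή είναι η Ελληνική έκδοση του BERT."
    print(strip_accents_and_lowercase(sample))
```

Running this step by hand is harmless, but it should be redundant when the tokenizer is loaded with lower-casing enabled, since the tokenizer then applies the same kind of normalization before splitting into word pieces.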