Check whether off-the-shelf xlm-roberta performs better on our downstream tasks than the Irish-specific ga_bert. (xlm-roberta is more or less just roberta trained on the much larger XLM training data covering 100 languages, or in practice somewhat more, since the automatic language filter will have classified some data in other languages as belonging to one of the 100.)
https://peltarion.com/blog/data-science/a-deep-dive-into-multilingual-nlp-models suggests "that training monolingual models for small languages is unnecessary", since in their Swedish experiments "XLM-R achieved ~80% accuracy whereas the Swedish BERT models reached ~79% accuracy".
There are two xlm-roberta models to consider: base and large; both should go through the same evaluation pipeline.
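A minimal sketch of how the comparison could be wired up with the Hugging Face transformers library: the xlm-roberta-base and xlm-roberta-large checkpoint names are standard, but the gaBERT checkpoint name DCU-NLP/bert-base-irish-cased-v1 is an assumption, and num_labels and the fine-tuning step are placeholders for whatever our downstream tasks actually require.

    # Sketch: load each candidate checkpoint so the same fine-tuning and
    # evaluation pipeline can be run over all of them with identical settings.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    CHECKPOINTS = [
        "xlm-roberta-base",
        "xlm-roberta-large",
        "DCU-NLP/bert-base-irish-cased-v1",  # assumed gaBERT checkpoint name
    ]

    for name in CHECKPOINTS:
        tokenizer = AutoTokenizer.from_pretrained(name)
        # num_labels=2 is a placeholder for the label set of our downstream task.
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
        # ... fine-tune and evaluate on the downstream task here, keeping
        # hyperparameters identical across checkpoints so results are comparable ...
        print(name, sum(p.numel() for p in model.parameters()), "parameters")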