hopsparser / hopsparser

A neural dependency parser that does its best
https://hopsparser.readthedocs.io

Run ablation studies on char and fasttext when using BERT #22

Closed LoicGrobol closed 3 years ago

LoicGrobol commented 3 years ago

Since the recent improvements in BERT handling, I have been wondering how useful the char RNN and the fastText embeddings really are. I have some results suggesting that for modern French, where we have good BERT models, they don't really help. Systematic ablation studies (with at least a few runs with different seeds to get some semblance of statistics) are needed to confirm or refute this. If it proves true, we could consider making the parser more modular so that those lexers can be removed.

(Maybe non-contextual word embeddings too? Also we should definitely consider the case of Old French and the relevance of these lexers when a BERT lexer is not completely reliable.)
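To make the proposed protocol concrete, here is a minimal sketch of such an ablation loop in Python. It is not hopsparser code: the lexer names in `OPTIONAL_LEXERS` and the `train_and_evaluate` callable are hypothetical placeholders for whatever actually trains a parser with a given set of lexers and a given seed and returns a score (for instance LAS on a dev set). The BERT lexer is assumed to be always enabled and is therefore not part of the grid.

```python
from itertools import product
from statistics import mean, stdev
from typing import Callable, Dict, FrozenSet, Iterator, List, Sequence, Tuple

# Hypothetical names for the lexers under study; the BERT lexer is assumed always on.
OPTIONAL_LEXERS: Tuple[str, ...] = ("chars_rnn", "fasttext")


def ablation_grid(
    optional: Sequence[str], seeds: Sequence[int]
) -> Iterator[Tuple[FrozenSet[str], int]]:
    """Yield (enabled_lexers, seed) for every on/off combination of the optional lexers."""
    for mask in product((True, False), repeat=len(optional)):
        enabled = frozenset(name for name, on in zip(optional, mask) if on)
        for seed in seeds:
            yield enabled, seed


def run_ablation(
    train_and_evaluate: Callable[[FrozenSet[str], int], float],
    optional: Sequence[str] = OPTIONAL_LEXERS,
    seeds: Sequence[int] = (0, 1, 2),
) -> Dict[FrozenSet[str], Tuple[float, float]]:
    """Run every configuration with every seed; return (mean, sample std) per configuration."""
    scores: Dict[FrozenSet[str], List[float]] = {}
    for enabled, seed in ablation_grid(optional, seeds):
        # train_and_evaluate is a placeholder supplied by the caller.
        scores.setdefault(enabled, []).append(train_and_evaluate(enabled, seed))
    return {
        enabled: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for enabled, vals in scores.items()
    }
```

With two optional lexers and three seeds this amounts to 4 configurations and 12 training runs. Comparing the per-configuration mean against the spread across seeds gives a first indication of whether removing a lexer changes the score by more than seed noise.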

LoicGrobol commented 3 years ago

Some of these questions are answered in Grobol and Crabbé (2021): Analyse en dépendances du français avec des plongements contextualisés