Since the recent improvements in BERT handling, I wonder how useful the char RNN and the fastText embeddings really are. I have some results suggesting that for modern French, where good BERT models are available, they don't help much. Systematic ablation studies (with at least a few runs per configuration using different seeds, to get some semblance of statistics) are needed for more insight in that direction. If this proves true, we could consider making the parser more modular to allow removing those lexers.
(Maybe non-contextual word embeddings too? We should also definitely consider the case of Old French, and the relevance of these lexers when a completely reliable BERT lexer is not available.)
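The ablation comparison above could be summarized per configuration as mean ± standard deviation over seeds, so that differences between lexer setups can be judged against run-to-run variance. A minimal sketch, where the configuration names and all scores are invented for illustration:

```python
import statistics

# Hypothetical ablation results: parser LAS on a modern French dev set,
# one list of scores per lexer configuration, one entry per random seed.
# All numbers and configuration names here are made up.
results = {
    "bert+char_rnn+fasttext": [91.2, 91.0, 91.4, 91.1],
    "bert_only": [91.1, 91.3, 91.0, 91.2],
}

for config, scores in results.items():
    mean = statistics.mean(scores)
    # Sample standard deviation across seeds gives a rough sense of the
    # seed-to-seed variance that any claimed difference must exceed.
    std = statistics.stdev(scores)
    print(f"{config}: LAS {mean:.2f} +/- {std:.2f} over {len(scores)} seeds")
```

With only a handful of seeds per configuration, such statistics are rough at best, but they are enough to flag when a lexer's apparent contribution is within noise.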