[27/07/2020] Monday 7:00 PM GMT+2 What Kind of Language Is Hard to Language-Model? ACL19 #22

Closed by hadyelsahar 3 years ago

hadyelsahar commented 3 years ago

What Kind of Language Is Hard to Language-Model? ACL19 https://arxiv.org/pdf/1906.04726.pdf

Are there some types of language that are easier to model with current methods? In prior work (Cotterell et al., 2018) we attempted to address this question for language modeling, and observed that recurrent neural network language models do not perform equally well over all the high-resource European languages found in the Europarl corpus. We speculated that inflectional morphology may be the primary culprit for the discrepancy. In this paper, we extend these earlier experiments to cover 69 languages from 13 language families using a multilingual Bible corpus.

hadyelsahar commented 3 years ago

If you have a hard time following the concepts in the paper, I recommend this blog post as an introduction: https://thegradient.pub/understanding-evaluation-metrics-for-language-models/

hadyelsahar commented 3 years ago

A follow-up paper:

It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information https://arxiv.org/abs/2005.02354