sebastianruder / NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
https://nlpprogress.com/
MIT License

Maybe we should add a readability assessment task, too? #533

Open brucewlee opened 3 years ago

brucewlee commented 3 years ago

There have been meaningful neural and non-neural approaches to this task. The linguistic features developed in this field can often be transferred to other areas of text classification.

We shouldn't add this task under text classification, because readability assessment tends to focus on "measuring" the difficulty of a text. Similar classification models (SVM, HAN, etc.) are frequently used in readability assessment, but I believe that the goal is different.

sebastianruder commented 3 years ago

Good idea. Could this be part of a section on "Automated assessment of written text" or something along those lines? See, for example, (Yannakoudakis and Briscoe, 2004).

brucewlee commented 3 years ago

"Automated assessment of written text" is a great paper. More recent examples would be 'SOTA, non-neural 2016' and 'SOTA, neural 2020'.

Readability assessment has traditionally depended heavily on handcrafted features. SOTA models tend to be neural network-based, but more traditional ones use an SVM with ~100 linguistic features. There can be some useful insights from the traditional models as well.
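
To give a rough idea of what "handcrafted linguistic features" can look like, here is a minimal Python sketch. The function name `readability_features` and the three features (average sentence length, average word length, type-token ratio) are illustrative assumptions, not the feature set of any model mentioned in this thread. A real system would use many more features and pass the resulting vector to a classifier such as an SVM.

```python
import re


def readability_features(text: str) -> dict:
    """Compute a few simple surface features (hypothetical, for illustration only)."""
    # Crude sentence split on terminal punctuation; not a real sentence tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        # Longer sentences tend to be harder to read.
        "avg_sentence_len": len(words) / len(sentences),
        # Longer words tend to be rarer and more complex.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Share of distinct words; a rough proxy for lexical diversity.
        "type_token_ratio": len(set(words)) / len(words),
    }


easy = "The cat sat. The cat ran."
hard = ("Notwithstanding considerable methodological heterogeneity, "
        "the findings remained remarkably consistent.")

print(readability_features(easy))
print(readability_features(hard))
```

Even these three numbers separate the two example texts, which hints at why feature-based models can still be informative alongside neural ones.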