Trying to predict text difficulty according to language proficiency levels (A1 to C2)
Many hand-crafted features (406!) used as input to an SVM. Feature groups:
Lexical features
Syntactic features
Semantic features
Features specific to French
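To make "hand-crafted features" concrete, here is a minimal sketch of a few surface features of the lexical and syntactic kind. These are not the paper's actual features: the function name `surface_features` and the three measures (mean word length, type-token ratio, mean sentence length) are assumptions chosen for illustration only.

```python
import re


def surface_features(text: str) -> dict[str, float]:
    """Compute three illustrative surface features for a text.

    Hypothetical stand-ins for hand-crafted readability features;
    not the feature set used in the paper.
    """
    # Naive sentence split on ., ! and ? (an assumption; real pipelines use a tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Runs of Unicode letters, lowercased, so accented French words are kept whole.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return {
        # Lexical: longer words tend to be harder.
        "mean_word_length": sum(len(w) for w in words) / len(words),
        # Lexical: share of distinct words among all words.
        "type_token_ratio": len(set(words)) / len(words),
        # Syntactic proxy: words per sentence.
        "mean_sentence_length": len(words) / len(sentences),
    }


features = surface_features("Le chat dort. Le chien mange une pomme.")
print(features)
```

Each text would become one such feature vector, and a classifier such as an SVM would be trained on these vectors with the proficiency level as the label.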
Good paper for the other group because they compare the predictive power of different features
Lexical and syntactic features had the biggest predictive power; semantic features had no significant predictive power. The French-specific features had only a minor impact.
Best model (SVM trained with the most predictive features) had only 49% accuracy
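For context on the 49% figure, the sketch below computes exact-match accuracy over the six levels and the score that uniform random guessing would reach. The balanced-class baseline is an assumption (the class distribution is not stated in these notes), and the gold and predicted labels are made-up toy data, not results from the paper.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def accuracy(gold: list[str], predicted: list[str]) -> float:
    """Fraction of texts whose predicted level exactly matches the gold level."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Uniform guessing over six levels, assuming balanced classes (an assumption).
chance = 1 / len(LEVELS)

# Toy labels for illustration only; not data from the paper.
gold = ["A1", "A2", "B1", "B2", "C1", "C2"]
pred = ["A1", "A2", "B2", "B2", "C2", "C1"]
acc = accuracy(gold, pred)

print(f"accuracy={acc:.2f}, chance={chance:.2f}")
```

Under that balanced-class assumption, chance is about 17%, so 49% is well above random guessing even though it leaves roughly half of the texts misclassified.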
http://www.aclweb.org/anthology/D12-1043