shlomihod / deep-text-eval

Differentiable Readability Measure Regularizer for Neural Network Automatic Text Simplification

Predicting the Readability of Short Web Summaries #6

Closed shlomihod closed 6 years ago

shlomihod commented 6 years ago

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.168.26&rep=rep1&type=pdf

vageeshSaxena commented 6 years ago

Reading

vageeshSaxena commented 6 years ago

1) Objective: To display a short summary of each web page in the result set for a better web-search experience. Specifically:
   A) To see whether there is any correlation between the predicted judgment values and the true human judgments.
   B) To understand which features are relatively more important for predicting readability values.
   C) To investigate the nature of the issues that lead to disagreements between the predicted and true judgment values.

2) Results: Presented on approximately 5000 editorial judgments collected over the course of a year, with examples where the model predicts quality well and where it disagrees with human judgments. These results are then compared to previous readability models, most notably Collins-Thompson-Callan, Fog, and Flesch-Kincaid; the proposed model shows substantially better correlation with editorial judgments as measured by Pearson's correlation coefficient.

3) Data-collection methodology:
   A) Queries are first sampled randomly from a weekly query log.
   B) The queries are then issued to various search engines and the top-k results (usually k = 10) are collected.

4) Steps followed:
   A) Collected a corpus of web search result summaries using the methodology above.
   B) Extracted various features from the summaries and modeled the judgments as a function of the features using stochastic gradient boosted decision trees (GBDT).
   C) This model is then used to rate a collection of new search result summaries from 1 to 5, where 1 is poor and 5 is good.

5) Features used for the regression approach to modeling summary readability:
   A) Fog: a readability measure based on features such as the average number of syllables per word. It is a fixed linear formula computed from the features; the weights are fixed.
   B) Flesch: a metric similar to Fog (points to another article).
   C) Flesch-Kincaid: another long-prose metric similar to Fog and Flesch (points to the same article as B).
   D) Average characters per word.
   E) Average syllables per word.
   F) Percentage of complex words: a feature used by Flesch.
   G) Number of snippets: the number of fragments in an abstract.
   H) Beginning ellipsis: does a leading ellipsis impact readability?
   I) Ending ellipsis: does a trailing ellipsis impact readability?
   J) Capital-letter fraction: should not be high, otherwise the summary looks like spam.
   K) Punctuation-character fraction: should not be high, otherwise the summary looks like spam.
   L) Stop-word fraction: multiple occurrences of keywords serve as a surrogate for a real language model.
   M) Query-word hit fraction.

6) Results: Promising.
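A few of the surface features above (average characters/syllables per word, complex-word fraction, Fog, Flesch-Kincaid, capital-letter fraction, ending ellipsis) can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the syllable counter is a crude vowel-group heuristic, and the "3+ syllables = complex" threshold is the common Gunning Fog convention, which the paper may or may not follow.

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel groups; the paper does not specify its counter."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def surface_features(summary):
    """Compute a handful of the surface features listed above for one summary."""
    words = re.findall(r"[A-Za-z']+", summary)
    sentences = [s for s in re.split(r"[.!?]+", summary) if s.strip()]
    n_words = max(1, len(words))
    n_sents = max(1, len(sentences))
    syllables = [count_syllables(w) for w in words]
    # Common convention: a "complex" word has 3 or more syllables.
    complex_frac = sum(1 for s in syllables if s >= 3) / n_words
    return {
        "avg_chars_per_word": sum(len(w) for w in words) / n_words,
        "avg_syllables_per_word": sum(syllables) / n_words,
        "complex_word_fraction": complex_frac,
        # Gunning Fog index: 0.4 * (words per sentence + 100 * complex-word fraction)
        "fog": 0.4 * (n_words / n_sents + 100.0 * complex_frac),
        # Flesch-Kincaid grade level
        "flesch_kincaid": 0.39 * (n_words / n_sents)
                          + 11.8 * (sum(syllables) / n_words) - 15.59,
        "capital_letter_fraction": sum(c.isupper() for c in summary) / max(1, len(summary)),
        "ends_with_ellipsis": summary.rstrip().endswith("..."),
    }
```

In the paper's setup, a vector like this per summary would be the input to the GBDT regressor that maps features to a 1-5 judgment.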

| Score type | Pearson's coefficient |
| --- | --- |
| Fog | 0.01572242 |
| Kincaid | -0.02689905 |
| Flesch-Kincaid | 0.02323278 |
| Linear | -0.001198311 |
| Collins-Thompson-Callan | 0.0597 |
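The coefficient used in the table above is the standard Pearson correlation between predicted and human judgment scores. A minimal self-contained sketch (the score lists in the usage note are hypothetical, not the paper's data):

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length score sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical example: predicted 1-5 readability scores vs. human judgments.
predicted = [3.1, 4.2, 2.0, 4.8, 1.5]
human     = [3, 4, 2, 5, 2]
r = pearson(predicted, human)
```

Values near 0, as for Fog and Flesch-Kincaid in the table, indicate essentially no linear agreement with the editorial judgments.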