cdimascio / py-readability-metrics

📗 Score text readability using a number of formulas: Flesch-Kincaid Grade Level, Gunning Fog, ARI, Dale Chall, SMOG, and more
https://py-readability-metrics.readthedocs.io/en/latest/
MIT License

How to modify the code to measure the readability of a sentence with fewer than 100 words #14

Open MathrewLing opened 4 years ago

MathrewLing commented 4 years ago

Thank you very much!!!

cdimascio commented 4 years ago

@MathrewLing each scorer explicitly checks that there are at least 100 words. You could remove this limitation by adding an option to skip the check. The new option would need to be checked per scorer.

See the following: https://github.com/cdimascio/py-readability-metrics/blob/master/readability/scorers/dale_chall.py#L17

If you are up for making the code changes, I’ll certainly review them and work with you to merge them in.
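
As a rough illustration of the per-scorer opt-out described above, here is a minimal sketch. It is not the library's actual code: `SketchScorer`, the `skip_word_check` option, and the local `ReadabilityException` stand-in are all hypothetical names chosen for this example, and the 100-word threshold is taken from the discussion.

```python
class ReadabilityException(Exception):
    """Hypothetical stand-in for the library's own exception type."""


class SketchScorer:
    """Hypothetical scorer showing an opt-out for the word-count check."""

    def __init__(self, num_words, skip_word_check=False):
        # Enforce the 100-word minimum unless the caller opts out.
        if not skip_word_check and num_words < 100:
            raise ReadabilityException('100 words required.')
        self.num_words = num_words


# Default behavior: short text is rejected.
try:
    SketchScorer(40)
except ReadabilityException as err:
    print('rejected:', err)

# With the opt-out, the scorer is constructed anyway.
scorer = SketchScorer(40, skip_word_check=True)
print('accepted:', scorer.num_words, 'words')
```

Because every scorer performs its own check, each one would need the same option threaded through its constructor.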

cdimascio commented 4 years ago

Note that using fewer than 100 words may not provide a sufficient signal, and hence the accuracy of the result will suffer.

son520804 commented 3 years ago

Thanks for raising the question of how the code should handle text with fewer than 100 words. Here is what I would do: for a sentence or document with fewer than 25 words, raise an error and return no result, following your reasoning that the readability score would suffer severely. For a document with between 25 and 100 words, how about returning the result while warning the user that "Document with fewer than 100 words may not provide sufficient signal to the algorithm and hence the accuracy of the result will suffer"?
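
A minimal sketch of this two-tier idea, assuming the thresholds from the comment above (25 and 100 words). The function name `check_word_count`, the constants, and the use of `ValueError` and `UserWarning` are assumptions made for illustration; none of them come from the library.

```python
import warnings

# Thresholds taken from the proposal; the names are hypothetical.
HARD_MIN_WORDS = 25
SOFT_MIN_WORDS = 100

WARNING_TEXT = (
    'Document with fewer than 100 words may not provide sufficient signal '
    'to the algorithm and hence the accuracy of the result will suffer'
)


def check_word_count(num_words):
    """Return True if the text is long enough to score without caveats.

    Raises ValueError below the hard minimum, and emits a warning
    (returning False) between the hard and soft minimums.
    """
    if num_words < HARD_MIN_WORDS:
        raise ValueError(
            'At least {} words are required.'.format(HARD_MIN_WORDS)
        )
    if num_words < SOFT_MIN_WORDS:
        warnings.warn(WARNING_TEXT, UserWarning, stacklevel=2)
        return False
    return True
```

A scorer could call `check_word_count` before computing its result: a 10-word input would raise, a 60-word input would still be scored but with a warning, and a 100-word input would pass silently.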