Open MathrewLing opened 4 years ago
@MathrewLing each scorer explicitly checks that there are more than 100 words. You could remove this limitation by adding an option to skip the check. The new option would need to be checked per scorer.
See the following: https://github.com/cdimascio/py-readability-metrics/blob/master/readability/scorers/dale_chall.py#L17
If you are up for making the code changes, I'll certainly review them and work with you to merge them in.
Note that using fewer than 100 words may not provide a sufficient signal, so the accuracy of the result will suffer.
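For illustration, here is a rough sketch of what a per-scorer opt-out could look like. Everything in it is hypothetical: `SketchScorer`, `TooFewWordsError`, and the `skip_min_words_check` option are made-up names for this example, not the library's actual classes or parameters, and the score is a placeholder rather than a real readability formula.

```python
MIN_WORDS = 100  # threshold discussed in this thread


class TooFewWordsError(Exception):
    """Hypothetical error type used only in this sketch."""


class SketchScorer:
    """Hypothetical scorer; stands in for any one of the real scorers."""

    def __init__(self, words, skip_min_words_check=False):
        self.words = list(words)
        # The check runs by default and is bypassed only when explicitly requested.
        if not skip_min_words_check and len(self.words) < MIN_WORDS:
            raise TooFewWordsError(
                f"{MIN_WORDS} words required, got {len(self.words)}."
            )

    def score(self):
        # Placeholder metric (average word length), not a real readability formula.
        if not self.words:
            return 0.0
        return sum(len(w) for w in self.words) / len(self.words)
```

Since each scorer performs its own check, the same option would have to be threaded through every scorer in the same way.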
Thanks for raising the point that the code should be able to measure the readability of text with fewer than 100 words. Here is what I would do: for a sentence or document that contains fewer than 25 words, raise an error and return no result, based on your reasoning that the accuracy would suffer severely. However, for a document containing between 25 and 100 words, how about having the function return the result while warning users that "Document with fewer than 100 words may not provide sufficient signal to the algorithm and hence the accuracy of the result will suffer"?
Thank you very much!
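A minimal sketch of this two-tier proposal, assuming the word count is already available. The function name `check_word_count` and the constants are hypothetical and not part of the library; the sketch treats exactly 100 words as sufficient, which is one possible reading of the boundary.

```python
import warnings

ERROR_BELOW = 25   # proposed hard minimum: no result below this
WARN_BELOW = 100   # existing threshold: warn below this, but still score

LOW_SIGNAL_MESSAGE = (
    "Document with fewer than 100 words may not provide sufficient signal "
    "to the algorithm and hence the accuracy of the result will suffer"
)


def check_word_count(num_words):
    """Return True if the text meets the full threshold, False if it only warns.

    Raises ValueError when the text is too short to score at all.
    """
    if num_words < ERROR_BELOW:
        raise ValueError(
            f"At least {ERROR_BELOW} words required, got {num_words}."
        )
    if num_words < WARN_BELOW:
        # The caller still gets a score, along with a visible warning.
        warnings.warn(LOW_SIGNAL_MESSAGE, UserWarning, stacklevel=2)
        return False
    return True
```

A scorer could call this helper in place of its current check and then compute the score as usual whenever no error is raised.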