mholtzscher / spacy_readability

spaCy pipeline component for adding text readability meta data to Doc objects.
MIT License

ZeroDivisionError: when doc has no words #111

Closed garywu closed 5 years ago

garywu commented 5 years ago

Description

Accessing doc._.coleman_liau_index on a doc with no words raises a ZeroDivisionError, while the readability metrics printed before it return 0.

What I Did

I ran the sample code, but used '#' as the text for the spaCy doc instead of the original text.

import spacy
from spacy_readability import Readability

nlp = spacy.load('en')
read = Readability()
nlp.add_pipe(read, last=True)

doc = nlp("#")

print(doc._.flesch_kincaid_grade_level)
print(doc._.flesch_kincaid_reading_ease)
print(doc._.dale_chall)
print(doc._.smog)
print(doc._.coleman_liau_index)
print(doc._.automated_readability_index)

$ python tests/read.py
0
0
0
0
Traceback (most recent call last):
  File "tests/read.py", line 14, in <module>
    print(doc._.coleman_liau_index)
  File "/usr/local/anaconda3/envs/multil/lib/python3.7/site-packages/spacy/tokens/underscore.py", line 31, in __getattr__
    return getter(self._obj)
  File "/usr/local/anaconda3/envs/multi/lib/python3.7/site-packages/spacy_readability/__init__.py", line 103, in coleman_liau
    letters_to_words = letter_count / self.num_words * 100
ZeroDivisionError: division by zero

mholtzscher commented 5 years ago

Thanks for the detailed bug report. I confirmed the error and will try to get a fix out this week.
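
For reference, a minimal sketch of the kind of guard that would avoid the division in the traceback above. It is not the library's actual code or the fix that was shipped: the standalone function and its parameter names are hypothetical, and returning 0 for a doc with no words is an assumption chosen to match the 0 printed by the other metrics. The coefficients are the standard Coleman-Liau ones.

```python
def coleman_liau_index(letter_count: int, num_words: int, num_sentences: int) -> float:
    """Coleman-Liau index that returns 0.0 when there are no words.

    Hypothetical standalone helper for illustration only; the real
    component computes this from a spaCy Doc.
    """
    if num_words <= 0:
        # Assumption: report 0 for empty input, like the metrics that
        # already printed 0 for nlp("#"), instead of dividing by zero.
        return 0.0
    # Letters and sentences per 100 words, as in the standard formula.
    letters_to_words = letter_count / num_words * 100
    sentences_to_words = num_sentences / num_words * 100
    return 0.0588 * letters_to_words - 0.296 * sentences_to_words - 15.8


# A doc such as "#" yields zero words; the guard returns 0.0 instead of raising.
print(coleman_liau_index(letter_count=0, num_words=0, num_sentences=1))
# A short sentence with 5 words and 20 letters is scored normally.
print(coleman_liau_index(letter_count=20, num_words=5, num_sentences=1))
```

The same check would apply to any other metric that divides by the word count, such as the automated readability index printed on the last line of the sample script.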