wroberts / fsed

Aho-Corasick string replacement utility
MIT License

Code is trying to get the length of a generator object #1

Closed: vi3k6i5 closed this issue 5 years ago

vi3k6i5 commented 7 years ago

# __future__ imports must come before all other imports in a module
from __future__ import unicode_literals

from fsed.fsed import build_trie, rewrite_str_with_trie

word_separation = True
word_encoding = 'utf-8'
slow_match = True

# reuse word_encoding so the trie and the input line are decoded consistently
trie, boundaries = build_trie('synonyms.txt', 'sed', word_encoding, word_separation)

line = unclean_doc.decode(word_encoding).rstrip('\n').lower()
line = rewrite_str_with_trie(line, trie, boundaries, slow_match)

error message:

TypeError                                 Traceback (most recent call last)
<ipython-input-6-daf73fb3cd97> in <module>()
     12     print(datetime.now(), doc)
     13     return unclean_doc, doc
---> 14 print(process_doc(doc))

<ipython-input-6-daf73fb3cd97> in process_doc(doc)
      8     print(datetime.now(), unclean_doc)
      9     line = unclean_doc.decode(word_encoding).rstrip('\n').lower()
---> 10     line = rewrite_str_with_trie(line, trie, boundaries, slow_match)
     11     doc = (line).encode(word_encoding)
     12     print(datetime.now(), doc)

/home/vikash/anaconda3/envs/python2.7/lib/python2.7/site-packages/fsed/fsed.pyc in rewrite_str_with_trie(sval, trie, boundaries, slow)
    173         sval = fsed.ahocorasick.boundary_transform(sval)
    174     if slow:
--> 175         sval = trie.replace(sval)
    176     else:
    177         sval = trie.greedy_replace(sval)

/home/vikash/anaconda3/envs/python2.7/lib/python2.7/site-packages/fsed/ahocorasick.pyc in replace(self, seq)
    282         #    each cell gets assigned (0, char), where char is the character at
    283         #    the corresponding position in the input string
--> 284         chart = [ [None for _i in range(len(seq)) ] for _i in range(len(seq)) ]
    285         chart[0] = [(0, char) for char in seq]
    286         # now we fill in the chart using the results from the aho-corasick

TypeError: object of type 'generator' has no len()

wroberts commented 7 years ago

Good catch, thanks. I almost never use --slow or replace because it takes so long to run. I hope this is fixed in v0.5.3. Could you test and confirm here that your issue is resolved? Thanks again for the bug report!
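
The TypeError in the traceback above can be reproduced without fsed: calling len() on a generator always fails, because a generator has no known length until it has been consumed. The sketch below is an illustration only, not fsed's code and not necessarily what the fix commit does. The names boundary_marked and build_chart are hypothetical stand-ins; the chart construction mirrors the two lines visible in the traceback, with the input materialized into a list first.

```python
def boundary_marked(text):
    """Hypothetical stand-in for a transform that yields its output lazily."""
    for char in text:
        yield char


def build_chart(seq):
    """Build an n-by-n chart from any iterable (hypothetical helper)."""
    # Materialize once so that len() and a second pass over the data both
    # work, whether seq is a string, a list, or a generator.
    seq = list(seq)
    n = len(seq)
    chart = [[None for _ in range(n)] for _ in range(n)]
    if n:
        # First row: one (0, char) cell per input position.
        chart[0] = [(0, char) for char in seq]
    return chart


# Calling len() directly on the generator raises the reported error.
try:
    len(boundary_marked("abc"))
    failed = False
    message = ""
except TypeError as exc:
    failed = True
    message = str(exc)

# The same input works once it has been turned into a list.
chart = build_chart(boundary_marked("abc"))
```

Materializing with list() costs memory proportional to the input, which seems acceptable here since the chart itself is already quadratic in the input length.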

vi3k6i5 commented 7 years ago

Thanks for listening. I'll wait for the bug fix and will definitely help with testing however I can :)

wroberts commented 7 years ago

The bug fix is commit 1efd5ef501fdc0db3b2327f08725759ce5a780cb, and you should already be able to update to 0.5.3 with pip.