#!/usr/bin/env python3
import polyglot
from polyglot.text import Text, Word
word = Text("Preprocessing is an essential step.").words[0]
print(word.morphemes)
When I try to run it (after calling polyglot.downloader.downloader.download('morph2.en')):
Traceback (most recent call last):
  File "./test.py", line 5, in <module>
    print(word.morphemes)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/decorators.py", line 20, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/text.py", line 286, in morphemes
    words, score = self.morpheme_analyzer.viterbi_segment(self.string)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/decorators.py", line 20, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/text.py", line 282, in morpheme_analyzer
    return load_morfessor_model(lang=self.language)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/decorators.py", line 30, in memoizer
    cache[key] = obj(*args, **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/polyglot/load.py", line 142, in load_morfessor_model
    model = io.read_any_model(tmp_file_.name)
  File "/usr/local/lib/python3.4/dist-packages/morfessor/io.py", line 203, in read_any_model
    model.load_segmentations(self.read_segmentation_file(file_name))
  File "/usr/local/lib/python3.4/dist-packages/morfessor/baseline.py", line 487, in load_segmentations
    for count, segmentation in segmentations:
  File "/usr/local/lib/python3.4/dist-packages/morfessor/io.py", line 53, in read_segmentation_file
    for line in self._read_text_file(file_name):
  File "/usr/local/lib/python3.4/dist-packages/morfessor/io.py", line 240, in _read_text_file
    encoding = self._find_encoding(file_name)
  File "/usr/local/lib/python3.4/dist-packages/morfessor/io.py", line 320, in _find_encoding
    raise UnicodeError("Can not determine encoding of input files")
UnicodeError: Can not determine encoding of input files
Versions:
$ python3 --version
Python 3.4.2
$ pip3 show polyglot | grep Version
Version: 16.07.04
$ pip3 show morfessor | grep Version
Version: 2.0.1
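So it seems that pip3 install morfessor does not pick the latest version, presumably because 2.0.2a1 is an alpha pre-release and pip skips pre-releases unless the requirement names one explicitly or --pre is given. Running pip3 install 'morfessor>=2.0.2a1' (as per the polyglot(1) warning) solved the issue.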
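To check whether an environment still has the older morfessor before running the script, something like the sketch below might help. It is not part of polyglot or morfessor: the helper names (version_key, is_too_old, installed_morfessor_version) are made up for illustration, the version parsing is a deliberately simplified subset of PEP 440, and importlib.metadata needs Python 3.8 or later (the Python 3.4 setup above would need a different lookup).

```python
import re
from importlib import metadata  # assumption: Python 3.8+, unlike the 3.4 setup in the post


def version_key(version):
    """Turn '2.0.1' or '2.0.2a1' into a sortable tuple.

    Simplified on purpose: handles only N.N.N with an optional a/b/rc suffix,
    and assumes both versions have the same number of release components.
    """
    match = re.match(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$", version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        # A final release sorts after every pre-release of the same number.
        pre = (3, 0)
    else:
        pre = ({"a": 0, "b": 1, "rc": 2}[match.group(2)], int(match.group(3)))
    return release, pre


def is_too_old(installed, minimum="2.0.2a1"):
    """True if the installed version is older than the minimum from the post."""
    return version_key(installed) < version_key(minimum)


def installed_morfessor_version():
    """Return the installed morfessor version string, or None if it is missing."""
    try:
        return metadata.version("morfessor")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_morfessor_version()
    if found is None:
        print("morfessor is not installed")
    elif is_too_old(found):
        print("morfessor %s is too old; try: pip3 install 'morfessor>=2.0.2a1'" % found)
    else:
        print("morfessor %s should be recent enough" % found)
```

With the versions listed above, is_too_old("2.0.1") is True, which matches the failure, while is_too_old("2.0.2a1") is False.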
Note: the script at the top is taken from the tutorial.