I'm doing some word lookups for Portuguese and got the following traceback:
  File "/home/intruder/source/tgalery/analytyca/analytyca/utils/context.py", line 9, in get_vector
    vector = embeddings[word_key]
  File "/usr/local/lib/python2.7/dist-packages/polyglot/mapping/embeddings.py", line 40, in __getitem__
    return self.vectors[self.vocabulary[k]]
  File "/usr/local/lib/python2.7/dist-packages/polyglot/mapping/expansion.py", line 29, in __getitem__
    return self.approximate_ids(key)
  File "/usr/local/lib/python2.7/dist-packages/polyglot/mapping/expansion.py", line 52, in approximate_ids
    raise KeyError("{} not found".format(key))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 5: ordinal not in range(128)
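Because the format string is a byte string and the incoming key is unicode, Python 2 tries to encode the key as ASCII while building the KeyError message. That raises a second exception (the UnicodeEncodeError above), which masks the original KeyError.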
I'm happy to fix this, but I wonder whether the keys are meant to be in binary format for lookups.
Let me know how best to proceed.
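For reference, here is a minimal sketch of the kind of fix I have in mind, assuming the keys are meant to be unicode strings. The `lookup` function and the `vocabulary` dict are hypothetical stand-ins for illustration, not polyglot's actual code; the only real change is the `u""` prefix on the message template.

```python
def lookup(vocabulary, key):
    """Hypothetical stand-in for the failing lookup in expansion.py."""
    try:
        return vocabulary[key]
    except KeyError:
        # The u"" prefix makes the template unicode, so Python 2 no longer
        # tries to encode a unicode key to ASCII while building the message.
        raise KeyError(u"{} not found".format(key))


# Usage: a missing non-ASCII key now surfaces as a plain KeyError.
try:
    lookup({u"casa": 0}, u"ol\xe1")
except KeyError as err:
    message = err.args[0]
```

If the keys are actually meant to be byte strings, this would likely not be enough on Python 2, since formatting non-ASCII bytes into a unicode template triggers an implicit ASCII decode instead. That is why I'd like to confirm the expected key type first.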