mailgun / talon

Cannot do talon.init() in Python 3 #117

Closed astrojuanlu closed 7 years ago

astrojuanlu commented 8 years ago

talon is shipping with a pickle file generated in Python 2:

https://github.com/mailgun/talon/blob/v1.3.2/talon/signature/data/classifier

which fails to unpickle on Python 3:

$ python --version
Python 3.5.2 :: Continuum Analytics, Inc.
$ python -c "import talon; talon.init()"
Traceback (most recent call last):
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 507, in _unpickle
    obj = unpickler.load()
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/pickle.py", line 1039, in load
    dispatch[key[0]](self)
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/pickle.py", line 1219, in load_short_binstring
    self.append(self._decode_string(data))
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/pickle.py", line 1159, in _decode_string
    return value.decode(self.encoding, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbe in position 0: ordinal not in range(128)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/talon/__init__.py", line 13, in init
    signature.initialize()
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/talon/signature/__init__.py", line 39, in initialize
    EXTRACTOR_DATA)
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/talon/signature/learning/classifier.py", line 32, in load
    return joblib.load(saved_classifier_filename)
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 575, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/jlcano/.miniconda3/envs/py3/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 522, in _unpickle
    raise new_exc
ValueError: You may be trying to read with python 3 a joblib pickle generated with python 2. This feature is not supported by joblib.
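
The first traceback's `UnicodeDecodeError` can be reproduced without talon or joblib. The sketch below is only an illustration under stated assumptions: the pickle bytes are hand-built (not taken from talon's classifier file) to mimic a Python 2 `str` holding the byte 0xbe, and `try_load` is a hypothetical helper. It uses the standard library `pickle.loads` and its `encoding` argument; it does not show that the joblib file shipped with talon can be loaded this way.

```python
import pickle

# Hand-built protocol-2 pickle of a Python 2 `str` holding the single byte 0xbe:
# PROTO 2, SHORT_BINSTRING (length 1, payload 0xbe), STOP.
# Assumption: this stands in for the kind of data inside a Python 2 pickle.
PY2_PICKLE = b"\x80\x02U\x01\xbe."


def try_load(data, **kwargs):
    """Hypothetical helper: return the unpickled object, or the exception raised."""
    try:
        return pickle.loads(data, **kwargs)
    except Exception as exc:
        return exc


# Python 3 decodes Python 2 `str` objects as ASCII by default, which fails here.
default_result = try_load(PY2_PICKLE)
# latin1 maps every byte to a code point, so decoding cannot fail.
latin1_result = try_load(PY2_PICKLE, encoding="latin1")
# "bytes" leaves Python 2 `str` objects as Python 3 `bytes`.
bytes_result = try_load(PY2_PICKLE, encoding="bytes")

print(type(default_result).__name__)
print(repr(latin1_result))
print(repr(bytes_result))
```

This only covers plain `pickle`. The `ValueError` above comes from joblib, which states that reading a Python 2 joblib pickle from Python 3 is not supported, so regenerating the classifier file under Python 3 may still be needed.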

mnba commented 8 years ago

I can confirm this too. While the Python 2 version works sometimes (signature extraction is currently broken), it seems Python 3 is not supported at all.

obukhov-sergey commented 7 years ago

Duplicate of https://github.com/mailgun/talon/issues/42