mailgun / talon

Apache License 2.0

Unable to use signature extraction library #218

Open ghost opened 3 years ago

ghost commented 3 years ago

c:\Development\iMemori>python src\test.py -i src\imem.properties
Traceback (most recent call last):
  File "c:\Development\iMemori\src\test.py", line 6, in <module>
    from talon import signature
  File "C:\Applications\Python\3.9.4\lib\site-packages\talon\signature\__init__.py", line 28, in <module>
    from .learning import classifier
  File "C:\Applications\Python\3.9.4\lib\site-packages\talon\signature\learning\classifier.py", line 11, in <module>
    from talon import joblib
ImportError: cannot import name 'joblib' from 'talon' (C:\Applications\Python\3.9.4\lib\site-packages\talon\__init__.py)

Editing the classifier.py above to use "import joblib" instead results in the error below:

Traceback (most recent call last):
  File "C:\Applications\Python\3.9.4\lib\site-packages\talon\signature\learning\classifier.py", line 34, in load
    return joblib.load(saved_classifier_filename)
  File "C:\Applications\Python\3.9.4\lib\site-packages\joblib\numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "C:\Applications\Python\3.9.4\lib\site-packages\joblib\numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "C:\Applications\Python\3.9.4\lib\pickle.py", line 1212, in load
    dispatch[key[0]](self)
  File "C:\Applications\Python\3.9.4\lib\pickle.py", line 1528, in load_global
    klass = self.find_class(module, name)
  File "C:\Applications\Python\3.9.4\lib\pickle.py", line 1579, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.svm.classes'
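
For context on the ModuleNotFoundError above: a pickle stores the import path of each class it contains rather than the class itself, so a saved model stops loading once that path is missing from the installed package. The sketch below reproduces the effect with the standard library only. The names old_layout and Model are made up for illustration, standing in for sklearn.svm.classes and the pickled estimator; nothing here is talon or scikit-learn code.

```python
import pickle
import sys
import types


class Model:
    """Hypothetical stand-in for the estimator stored in the pickle."""

    def __init__(self, weight):
        self.weight = weight


# Pretend the class lives in a module called "old_layout" (made-up name).
old_module = types.ModuleType("old_layout")
old_module.Model = Model
Model.__module__ = "old_layout"
sys.modules["old_layout"] = old_module

# The pickle records the path "old_layout" + "Model", not the class body.
payload = pickle.dumps(Model(weight=3))

# Simulate a library upgrade that removes or renames the module.
del sys.modules["old_layout"]


def try_load(data):
    """Return the unpickled object, or the import error raised while loading."""
    try:
        return pickle.loads(data)
    except ModuleNotFoundError as exc:
        return exc


result = try_load(payload)
print(type(result).__name__, result)
```

Retraining sidesteps the problem because the freshly written pickle records whatever import paths the currently installed library uses.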

KadenWolff commented 3 years ago

Once the joblib import is updated (which you've done), the easiest fix is to retrain the model with the lines below. Either save them as a script or run them directly in a Python interactive shell.

from talon.signature import EXTRACTOR_FILENAME, EXTRACTOR_DATA
from talon.signature.learning.classifier import train, init
train(init(), EXTRACTOR_DATA, EXTRACTOR_FILENAME)

There are a bunch of PRs that fix this, such as #219; hopefully one will be approved at some point.
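
Until a fix is merged, the import in classifier.py has to be patched locally. Here is a minimal sketch of a tolerant import, assuming joblib is either installed as its own package or still bundled under sklearn.externals (as in older scikit-learn releases). The helper name import_first_available is hypothetical and not part of talon.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # this location is missing, try the next one
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(candidates)
    )


# In a locally patched classifier.py this could replace the failing import
# (assumption: one of the two locations exists in your environment):
# joblib = import_first_available(["joblib", "sklearn.externals.joblib"])
```

Listing the standalone package first means a current joblib install wins, and the older bundled copy is only tried as a fallback.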