rasbt / python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource
MIT License
12.24k stars, 4.4k forks

Can't import using pickle ch9 #18

favetelinguis closed 8 years ago

favetelinguis commented 8 years ago

When I try to read back the classifier on page 254, I get the following error. I have followed the book the whole way and things have worked fine until now. Any idea what has gone wrong?

I'm using IPython 4.2.0

AttributeError                            Traceback (most recent call last)
<ipython-input-4-f050da95a5cf> in <module>()
----> 1 import codecs, os;__pyfile = codecs.open('''/var/folders/yh/mm1bdmx9073_b15lw69b2qmh0000gn/T/py71220g7y''', encoding='''utf-8''');__code = __pyfile.read().encode('''utf-8''');__pyfile.close();os.remove('''/var/folders/yh/mm1bdmx9073_b15lw69b2qmh0000gn/T/py71220g7y''');exec(compile(__code, '''/Users/henke/Documents/code/python/python-ml/movieclassifier/main.py''', 'exec'));

/Users/henke/Documents/code/python/python-ml/movieclassifier/main.py in <module>()
      4 from vectorizer import vect
      5 
----> 6 clf = pickle.load(open(os.path.join('pkl_objects', 'classifier.pkl'), 'rb'))
      7 
      8 import numpy as np

AttributeError: Can't get attribute 'tokenizer' on <module '__main__'>

rasbt commented 8 years ago

Sorry about the trouble, Henrik. I just re-ran this example and it works fine for me. I prepared a simpler example for debugging, using only 100 samples from the movie dataset. Maybe you could try executing them on your system so that we can get a better idea of what's going on. Attached are 4 files: the small movie dataset, the vectorizer.py script, and the 2 scripts to execute. The first one, ch08_pickle-dump-test.py, creates the classifier and stopword pickle files, and the second one, ch08_pickle-load-test.py, loads the vectorizer and the classifier to make a prediction. The files should all be in the same directory, e.g., just put them on your desktop.

When I execute the two files, I get the following ...

~/Desktop$ python ch08_pickle-dump-test.py 
~/Desktop$ python ch08_pickle-load-test.py 
Prediction: positive
Probability: 85.71%

Would be nice if you could check whether they also throw this AttributeError: Can't get attribute 'tokenizer' on <module '__main__'> so that we know more!
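
For background, here is a minimal sketch of one general way this kind of AttributeError can arise. It is not a diagnosis of the case above. pickle stores a function as a reference (module name plus function name) rather than as code, so loading fails if that name cannot be found when the file is read back. The tokenizer body below is a simplified stand-in, not the book's implementation.

```python
import pickle


def tokenizer(text):
    # Hypothetical, simplified stand-in for a tokenizer function.
    return text.lower().split()


# Functions are pickled by reference: only the module name and the
# function name are written, not the function's code.
payload = pickle.dumps(tokenizer)

# Simulate a fresh session in which `tokenizer` was never defined.
del tokenizer

try:
    pickle.loads(payload)
    error_message = None
except AttributeError as exc:
    # The lookup of `tokenizer` on the defining module fails here.
    error_message = str(exc)

print(error_message)
```

Run as a script, this should print a message similar to the one in the traceback above. A common way to avoid the problem in general is to keep such helper functions in an importable module (for instance a file like vectorizer.py) and import it before calling pickle.load.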

PS: Sorry, I had to ZIP the files since GitHub complained about the attachment ("Unfortunately, we don’t support that file type. Try again with a PNG, GIF, JPG, DOCX, PPTX, XLSX, TXT, PDF, or ZIP."). pickle-debugging-1.zip

Best, Sebastian

favetelinguis commented 8 years ago

Thanks for the fast reply. Your new example works. After switching to a new conda virtual env I could also get my own code working, so I must have messed something up in the old env.

rasbt commented 8 years ago

Glad to hear that it was such an "easy" fix and not a deeper problem with the code itself :)