rtapiaoregui / collocater

Spacy integrable pipeline component to identify collocations in text
MIT License

Loader() fails - possibly due to virtual environment or jupyter notebook? Or pickles? #2

Closed. ChasNelson1990 closed this issue 4 years ago

ChasNelson1990 commented 4 years ago

Hi there, I just found your module and thought I'd give it a go. I installed it into my spaCy virtual environment with pipenv install collocater, opened a new Jupyter Notebook, copied and pasted the example code from the README.md, and hit run. I got the error below.

I don't know enough about what's going on in your code, but there seem to be three obvious things that could be causing problems here:

  1. It looks like the system falls down when trying to unpickle a file. Could it be that the code is searching for the file in a system path instead of a virtualenv path?
  2. And could this be due to the way I have JupyterLab configured? My Jupyter installation is global and I register each new virtualenv as a kernel with the ipykernel module. I know this occasionally causes issues with magic commands, where Jupyter assumes it should look in the system path. It seems unlikely, but could something similar be happening here?
  3. How did you pickle these files? I remember there being some issues going between Windows/Linux or Py2/Py3.
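
One way to check points 1 and 2 might be to ask the notebook kernel which interpreter it is running and where it would import a module from. This is only a sketch: describe_environment is a hypothetical helper, not part of collocater, and "json" is a placeholder that would be swapped for "collocater" inside the notebook.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report the running interpreter and where a module would be imported from."""
    # find_spec returns None when a top-level module cannot be found.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a virtualenv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_origin": spec.origin if spec else None,
    }


# "json" is only a placeholder; use "collocater" in the real notebook.
info = describe_environment("json")
for key, value in info.items():
    print(f"{key}: {value}")
```

If module_origin points into the virtualenv's site-packages (as the paths in the traceback below do), the kernel is picking up the right installation and the path theory can probably be set aside.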

Happy to go back and forth on any ideas.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1-f67211300592> in <module>
      3 from pprint import pprint
      4 
----> 5 collie = collocater.Collocater.loader()
      6 nlp = spacy.load('en_core_web_sm')
      7 nlp.add_pipe(collie)

~/.local/share/virtualenvs/novel-language-processing-0RQ7FeDi/lib/python3.8/site-packages/collocater/collocater.py in loader(path)
    143 
    144         if not path:
--> 145             obj = joblib.load(pkr.resource_stream(__name__, 'data/collocater_obj.joblib'))
    146         else:
    147             with open(path, 'rb') as fh:

~/.local/share/virtualenvs/novel-language-processing-0RQ7FeDi/lib/python3.8/site-packages/joblib/numpy_pickle.py in load(filename, mmap_mode)
    573         filename = getattr(fobj, 'name', '')
    574         with _read_fileobject(fobj, filename, mmap_mode) as fobj:
--> 575             obj = _unpickle(fobj)
    576     else:
    577         with open(filename, 'rb') as f:

~/.local/share/virtualenvs/novel-language-processing-0RQ7FeDi/lib/python3.8/site-packages/joblib/numpy_pickle.py in _unpickle(fobj, filename, mmap_mode)
    502     obj = None
    503     try:
--> 504         obj = unpickler.load()
    505         if unpickler.compat_mode:
    506             warnings.warn("The file '%s' has been generated with a "

/usr/lib64/python3.8/pickle.py in load(self)
   1208                     raise EOFError
   1209                 assert isinstance(key, bytes_types)
-> 1210                 dispatch[key[0]](self)
   1211         except _Stop as stopinst:
   1212             return stopinst.value

/usr/lib64/python3.8/pickle.py in load_global(self)
   1524         module = self.readline()[:-1].decode("utf-8")
   1525         name = self.readline()[:-1].decode("utf-8")
-> 1526         klass = self.find_class(module, name)
   1527         self.append(klass)
   1528     dispatch[GLOBAL[0]] = load_global

/usr/lib64/python3.8/pickle.py in find_class(self, module, name)
   1579             return _getattribute(sys.modules[module], name)[0]
   1580         else:
-> 1581             return getattr(sys.modules[module], name)
   1582 
   1583     def load_reduce(self):

AttributeError: module '__main__' has no attribute 'Collocater'

rtapiaoregui commented 4 years ago

Hello, thanks for taking the time to report this!

I believe I found the issue: the pickle file was generated by running collocater.py as a script, so the class was pickled as __main__.Collocater instead of under its importable module path (see https://stackoverflow.com/questions/49621169/joblib-load-main-attributeerror). I have committed the fix here and released it as version 0.3 of the module on PyPI. I would be extremely grateful if you could try again and report whether the update solved it.
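
For anyone hitting the same error, here is a minimal sketch of the mechanism. It is not the actual collocater code: the class is a toy that only shares the name, and the module "fake_main" is a hypothetical stand-in for __main__ so that the example behaves the same however it is run. A pickle stores only the module and name of a class, so loading fails when that module no longer exposes the class.

```python
import pickle
import pickletools
import sys
import types

# Hypothetical stand-in for "__main__" while a file is being run as a script.
script = types.ModuleType("fake_main")
sys.modules["fake_main"] = script

# Toy class that only shares its name with the real Collocater.
Collocater = type("Collocater", (), {"__module__": "fake_main"})
script.Collocater = Collocater

# Protocol 2 writes the class reference with the GLOBAL opcode.
payload = pickle.dumps(Collocater(), protocol=2)


def referenced_globals(data):
    """Return the 'module name' pairs the pickle asks the unpickler to look up."""
    return [arg for op, arg, _pos in pickletools.genops(data) if op.name == "GLOBAL"]


refs = referenced_globals(payload)
print(refs)

# Simulate loading where that module does not define the class,
# as happens when __main__ is a notebook rather than collocater.py.
del script.Collocater
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc
print(type(load_error).__name__)

# Once the class is reachable under the recorded module again, loading works.
script.Collocater = Collocater
restored = pickle.loads(payload)
```

Generating the pickle from code that imports the class from its package, rather than from the file run as a script, should make the recorded reference one that every consumer can resolve.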

Thanks again!

ChasNelson1990 commented 4 years ago

Great, thanks for the speedy response. Yes, that seems to solve it :-)