PetrochukM / PyTorch-NLP

Basic Utilities for PyTorch Natural Language Processing (NLP)
https://pytorchnlp.readthedocs.io
BSD 3-Clause "New" or "Revised" License

OSError on second usage of FastText() #96

Status: Closed (closed by rodinasophie 4 years ago)

rodinasophie commented 4 years ago

OS: Windows 10. Installed with pip from the Anaconda prompt.

Expected Behavior

from torchnlp.word_to_vector import FastText
vectors = FastText()

Once the vectors have been downloaded to disk, loading the FastText() model into memory should always work.

Actual Behavior

The second time the model is loaded, the following error is raised:

OSError                                   Traceback (most recent call last)
<ipython-input-15-b2b205de2be4> in <module>
----> 1 vectors = FastText()

~\Anaconda3\lib\site-packages\torchnlp\word_to_vector\fast_text.py in __init__(self, language, aligned, **kwargs)
     81             url = self.url_base.format(language)
     82         name = os.path.basename(url)
---> 83         super(FastText, self).__init__(name, url=url, **kwargs)

~\Anaconda3\lib\site-packages\torchnlp\word_to_vector\pretrained_word_vectors.py in __init__(self, name, cache, url, unk_init, is_include)
     70         self.is_include = is_include
     71         self.name = name
---> 72         self.cache(name, cache, url=url)
     73 
     74     def __contains__(self, token):

~\Anaconda3\lib\site-packages\torchnlp\word_to_vector\pretrained_word_vectors.py in cache(self, name, cache, url)
    175         else:
    176             logger.info('Loading vectors from {}'.format(path_pt))
--> 177             self.index_to_token, self.token_to_index, self.vectors, self.dim = torch.load(path_pt)

~\Anaconda3\lib\site-packages\torch\serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    384         f = f.open('rb')
    385     try:
--> 386         return _load(f, map_location, pickle_module, **pickle_load_args)
    387     finally:
    388         if new_fd:

~\Anaconda3\lib\site-packages\torch\serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    578     for key in deserialized_storage_keys:
    579         assert key in deserialized_objects
--> 580         deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly)
    581         if offset is not None:
    582             offset = f.tell()

OSError: [Errno 22] Invalid argument

Steps to Reproduce the Problem

  1. Install torchnlp and run the following once so the vectors are downloaded into .word_vectors_cache:
    from torchnlp.word_to_vector import FastText
    vectors = FastText()
  2. Restart the Jupyter notebook kernel (or simply rerun the cell) and run the code above a second time.

PetrochukM commented 4 years ago

I was unable to reproduce the problem: [image]
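
Since the traceback fails inside torch.load(path_pt), one way to narrow things down is to look at the cached files themselves before loading them. The thread does not establish a root cause, so this is only a minimal diagnostic sketch. It assumes the cache lives in .word_vectors_cache under the current working directory and that the serialized tensors end in .pt (suggested by the path_pt variable in the traceback). The helper name list_cached_tensors is hypothetical and not part of torchnlp.

```python
from pathlib import Path


def list_cached_tensors(cache_dir=".word_vectors_cache"):
    """Return (path, size_in_bytes) for each .pt file in cache_dir, largest first.

    Hypothetical helper for diagnosis only; the default directory name and the
    .pt suffix are assumptions taken from the issue text and the traceback.
    """
    root = Path(cache_dir)
    if not root.is_dir():
        # Nothing has been cached here (or the working directory differs).
        return []
    entries = [(str(p), p.stat().st_size) for p in root.glob("*.pt")]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    for path, size in list_cached_tensors():
        # Report sizes in GiB so very large or truncated files stand out.
        print(f"{path}: {size / 2**30:.2f} GiB")
```

If a file looks unexpectedly small or unusually large, calling torch.load on that path in a fresh interpreter, outside the notebook, may show whether the error depends on the file or on the environment. Deleting the .pt file so that it is regenerated on the next FastText() call is another thing to try, though whether that resolves this particular error is not confirmed here.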