kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/
Apache License 2.0

TypeError: can't pickle Environment objects on Windows/MacOs #14

Open fortepianissimo opened 6 years ago

fortepianissimo commented 6 years ago

I'm running under Windows 10, following the instructions in the README. When trying to retrain the model with this command

python nerTagger.py --dataset-type conll2003 train_eval

I ran into the following exception (right after compiling embeddings) - any tips?

Thank you for the wonderful work!

Compiling embeddings... (this is done only one time per embeddings at first launch)
path: d:\Projects\embeddings\glove.840B.300d.txt
100%|████████████████████████████████████████████████████████████████████| 2196017/2196017 [08:06<00:00, 4517.80it/s] embeddings loaded for 2196006 words and 300 dimensions
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
char_input (InputLayer)         (None, None, 30)     0
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, None, 30, 25) 2150        char_input[0][0]
__________________________________________________________________________________________________
word_input (InputLayer)         (None, None, 300)    0
__________________________________________________________________________________________________
time_distributed_2 (TimeDistrib (None, None, 50)     10200       time_distributed_1[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, None, 350)    0           word_input[0][0]
                                                                 time_distributed_2[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, None, 350)    0           concatenate_1[0][0]
__________________________________________________________________________________________________
bidirectional_2 (Bidirectional) (None, None, 200)    360800      dropout_1[0][0]
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, None, 200)    0           bidirectional_2[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, None, 100)    20100       dropout_2[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, None, 10)     1010        dense_1[0][0]
__________________________________________________________________________________________________
chain_crf_1 (ChainCRF)          (None, None, 10)     120         dense_2[0][0]
==================================================================================================
Total params: 394,380
Trainable params: 394,380
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/60
Exception in thread Thread-2:
Traceback (most recent call last):
  File "d:\Anaconda3\Lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "d:\Anaconda3\Lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "d:\Projects\delft\env\lib\site-packages\keras\utils\data_utils.py", line 548, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "d:\Projects\delft\env\lib\site-packages\keras\utils\data_utils.py", line 522, in <lambda>
    initargs=(seqs,))
  File "d:\Anaconda3\Lib\multiprocessing\context.py", line 119, in Pool
    context=self.get_context())
  File "d:\Anaconda3\Lib\multiprocessing\pool.py", line 174, in __init__
    self._repopulate_pool()
  File "d:\Anaconda3\Lib\multiprocessing\pool.py", line 239, in _repopulate_pool
    w.start()
  File "d:\Anaconda3\Lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "d:\Anaconda3\Lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\Lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "d:\Anaconda3\Lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects
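For context, the root cause is that lmdb's Environment wraps an OS-level handle and is not picklable, while multiprocessing on Windows (and on macOS with recent Pythons) uses the "spawn" start method, which pickles everything it sends to worker processes. A minimal illustration of this class of error, using a lock object as a stand-in for the Environment (not delft code):

```python
import pickle
import threading

# Objects wrapping OS-level handles (locks, sockets, lmdb Environments)
# cannot be serialised with pickle. Under the "spawn" start method,
# multiprocessing pickles every object handed to a worker process, so a
# Sequence holding such a handle fails exactly like the traceback above.
lock = threading.Lock()  # stand-in for an lmdb Environment

try:
    pickle.dumps(lock)
    picklable = True
except TypeError as err:
    picklable = False
    print(type(err).__name__, err)
```

On Linux the default start method is "fork", which copies the process instead of pickling, which is why the same code runs there.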
fortepianissimo commented 6 years ago

Okay - disabling lmdb in embedding-registry.json seems to make that exception go away. BUT now there's another exception:

__________________________________________________________________________________________________
Epoch 1/60
d:\Projects\delft\env\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
d:\Projects\delft\env\lib\site-packages\gensim\utils.py:1197: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\Anaconda3\Lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "d:\Anaconda3\Lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "d:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  File "d:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  File "d:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  [Previous line repeated 328 more times]
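The infinite recursion comes from the `__getattr__` on line 78 of utilities/Embeddings.py: when the child process unpickles the object, its `__dict__` has not been restored yet, so looking up self.model misses and re-enters `__getattr__` with name='model' forever. A minimal reproduction of the pattern (a sketch, not the actual delft class):

```python
class Delegating:
    def __init__(self):
        self.model = object()

    # Mirrors utilities/Embeddings.py line 78: delegate unknown
    # attribute lookups to self.model.
    def __getattr__(self, name):
        # If 'model' is missing from __dict__ (as during unpickling,
        # before the state is restored), this lookup calls __getattr__
        # again with name='model', recursing without end.
        return getattr(self.model, name)

# Simulate what pickle does: create the instance without __init__.
obj = Delegating.__new__(Delegating)
try:
    obj.anything
    recursed = False
except RecursionError:
    recursed = True
```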
pjox commented 6 years ago

Hello! I haven't been able to reproduce the exception in Linux so it might be windows related. I'm trying to get a windows machine in order to try again. In the meanwhile, can you tell us a little more about your set-up? For instance, are you using a GPU? Did you use the requirements-gpu.txt files to set it up? Also, which version of python are you using?

Thanks!

fortepianissimo commented 6 years ago

Hi, sorry I wasn't very clear about my specs:

fortepianissimo commented 6 years ago

By the way, I also solved another error along the way: a DLL load failed message when scikit-learn is imported.

The solution is to install numpy‑1.14.6+mkl‑cp36‑cp36m‑win_amd64.whl (depending on the arch and Python version) from https://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy

pjox commented 6 years ago

Ok, we had some problems before with Python 3.6. I honestly don't think that the Python version is the problem, but if you have the time, can you try creating a Python 3.5 environment with conda (conda create -n myenv python=3.5) and see if you encounter the same problems? As soon as I get to try DeLFT on Windows I'll get back to you.

fortepianissimo commented 6 years ago

Ok, I set up a Python 3.5 environment (version 3.5.6 via Anaconda) and created another env_python35 under the delft directory. Here are the errors (infinite recursion):

Epoch 1/60
D:\Projects\delft\env_python35\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
D:\Projects\delft\env_python35\lib\site-packages\gensim\utils.py:1197: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\spawn.py", line 106, in spawn_main
    exitcode = _main(fd)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\spawn.py", line 116, in _main
    self = pickle.load(from_parent)
  File "D:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\Embeddings.py", line 78, in __getattr__
... (more same lines like the above) ...
RecursionError: maximum recursion depth exceeded while calling a Python object
Exception in thread Thread-1:
Traceback (most recent call last):
  File "d:\Anaconda3\envs\python35_env\Lib\threading.py", line 914, in _bootstrap_inner
    self.run()
  File "d:\Anaconda3\envs\python35_env\Lib\threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "D:\Projects\delft\env_python35\lib\site-packages\keras\utils\data_utils.py", line 548, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "D:\Projects\delft\env_python35\lib\site-packages\keras\utils\data_utils.py", line 522, in <lambda>
    initargs=(seqs,))
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\context.py", line 118, in Pool
    context=self.get_context())
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\pool.py", line 174, in __init__
    self._repopulate_pool()
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\pool.py", line 239, in _repopulate_pool
    w.start()
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\context.py", line 313, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\popen_spawn_win32.py", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\reduction.py", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
pjox commented 6 years ago

Thanks for the info! I have been looking around, and apparently the multiprocessing library works differently on Windows, so this series of errors you are encountering might be caused by that. However, I haven't been able to find a Windows machine to test it on yet; as soon as I can get hold of one, I'll get back to you.

pjox commented 6 years ago

@fortepianissimo I finally got hold of a Windows machine and was able to reproduce the error. Could you please comment out lines 77 and 78 in the file utilities/Embeddings.py, that is, these lines:

def __getattr__(self, name):
    return getattr(self.model, name)

and try again?

Note 1: Please also disable lmdb in embedding-registry.json.
Note 2: This is a workaround rather than a fix; I'll work on a definitive fix in the future.

Also, please let me know if the workaround works!
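An alternative to commenting the lines out would be to guard the delegation so that a missing model attribute raises AttributeError instead of recursing. A sketch of that idea (an assumption about a possible fix, not the actual delft change):

```python
class SafeDelegating:
    """Sketch of a delegating __getattr__ that cannot recurse on itself."""

    def __init__(self, model):
        self.model = model

    def __getattr__(self, name):
        # 'model' missing from __dict__ means the object is mid-unpickling:
        # fail cleanly instead of re-entering __getattr__ forever.
        if name == "model":
            raise AttributeError(name)
        return getattr(self.model, name)

# As during unpickling: instance created without running __init__.
obj = SafeDelegating.__new__(SafeDelegating)
try:
    obj.vocab_size  # hypothetical attribute, for illustration only
    failed_cleanly = False
except AttributeError:
    failed_cleanly = True
```

With the guard in place, a fully initialised instance still delegates normally, while a half-unpickled one raises AttributeError, which pickle knows how to handle.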

ghost commented 5 years ago

Hello, I'm new to this. My specs are:

And I want to ask two things:

Using TensorFlow backend.
D:\Anaconda3\envs\ULR\lib\site-packages\gensim\utils.py:1197: UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda3\envs\ULR\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda3\envs\ULR\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Edit: for the first question, I've found the answer: set "embedding-lmdb-path" to "None" in embedding-registry.json.
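For later readers, disabling lmdb amounts to setting this key in embedding-registry.json, leaving the other entries in the file unchanged (only the relevant key is shown here):

```json
{
    "embedding-lmdb-path": "None"
}
```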

davidlenz commented 5 years ago

I face the same issue as @Protossnam (EOFError: Ran out of input). I'm on Windows 10 with Python 3.5. Any updates on this?

ghost commented 5 years ago

@davidlenz Sadly, I had to boot my laptop into Linux (Ubuntu) and run the tool there; on Linux I didn't face that issue. It's probably a problem specific to Windows, and I'm also looking forward to hearing updates on this.

oterrier commented 4 years ago

Hi all, an easy workaround would be to disable multiprocessing when running on Windows. To do that, you need to pass multiprocessing=False each time a new Sequence object is created in nerTagger.py.

My 2 cts

Olivier
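A portable version of this workaround would derive the flag from the platform, so Linux keeps multiprocessing. A sketch under the assumption that the Sequence consumer takes a Keras-style use_multiprocessing keyword (the exact argument name used in nerTagger.py may differ):

```python
import platform

# "spawn" is the default multiprocessing start method on Windows (and on
# macOS for Python >= 3.8); it requires pickling the Sequence, which is
# what fails here. Linux defaults to "fork" and is unaffected.
use_multiprocessing = platform.system() not in ("Windows", "Darwin")

# The flag would then be passed wherever the generator is consumed, e.g.:
# model.fit_generator(training_generator,
#                     use_multiprocessing=use_multiprocessing, ...)
```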

lfoppiano commented 2 years ago

I have this issue when the download fails and the database is not correctly initialised, I suppose:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/site-packages/keras/utils/data_utils.py", line 744, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/site-packages/keras/utils/data_utils.py", line 721, in pool_fn
    pool = get_pool_class(True)(
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'Environment' object

Update: I tried running again and the database was correctly created (via a local copy of GloVe); however, the problem still occurs, probably due to multiprocessing...

To reproduce it I used:

python -m delft.applications.citationClassifier train_eval

Update: I'm having this problem with macOS.

lfoppiano commented 2 years ago

I have the same problem on macOS.

The solution is to disable multiprocessing by setting nb_workers = 0. Depending on the task to be performed, it should be modified in both sequenceLabelling/wrapper.py and trainer.py (line 172).
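With nb_workers = 0, Keras iterates the Sequence in the main process and nothing needs to be pickled. A guarded version that only pays that cost on the platforms where spawn is the default start method (the non-zero value below is an illustrative assumption, not delft's actual default):

```python
import platform

# spawn (Windows, and macOS on Python >= 3.8) pickles the Sequence for
# each worker process, which is what fails; nb_workers = 0 keeps all
# data loading in the main process instead.
if platform.system() in ("Windows", "Darwin"):
    nb_workers = 0
else:
    nb_workers = 6  # illustrative value for fork-based platforms
```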