wan-wei opened this issue 5 years ago
can confirm the issue
same error here
Same error here. When I try with 1 GPU, the error looks like:

```
Traceback (most recent call last):
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/data/liangweiqi/Tacotron-2/wavenet_vocoder/feeder.py", line 230, in _enqueue_next_test_group
    self._session.run(self._eval_enqueue_op, feed_dict=feed_dict)
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/cuimi/anaconda3/envs/tensorflow3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Run call was cancelled
```
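A CancelledError in the feeder thread is usually a secondary symptom: the enqueue call gets cancelled because the queue/session is being shut down after something else has already failed, so the real error is often earlier in the log. Here is a minimal, self-contained TF 1.x sketch (not code from this repo; the queue and thread names are made up) that produces the same error class:

```python
import threading
import tensorflow as tf  # assumes TensorFlow 1.x, as used by Tacotron-2

# A tiny stand-in for the feeder's eval queue.
queue = tf.FIFOQueue(capacity=8, dtypes=[tf.float32], shapes=[()])
value = tf.placeholder(tf.float32, shape=())
enqueue_op = queue.enqueue([value])

sess = tf.Session()

def feeder():
    try:
        while True:
            # Mirrors self._session.run(self._eval_enqueue_op, ...) in feeder.py.
            sess.run(enqueue_op, feed_dict={value: 0.0})
    except tf.errors.CancelledError:
        # Same error class as in the traceback above: the run call was
        # cancelled because the queue was closed, not because the data is bad.
        print('enqueue cancelled')

thread = threading.Thread(target=feeder, daemon=True)
thread.start()

# Simulate the main thread aborting training and cancelling pending enqueues.
sess.run(queue.close(cancel_pending_enqueues=True))
thread.join()
```

So when this shows up, it is worth scrolling up for the first exception raised in the main training thread.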
I got the same error when I train on multiple GPUs.
I intended to train WaveNet in a multi-GPU setting, since the hyperparameters in paper_hparams.py:
will run into an OOM issue in a single-GPU setting unless I decrease one of the params above.
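As a side note, here is a minimal sketch of what I mean by overriding those values from the command line. It assumes the hparams live in a tf.contrib.training.HParams object (as is typical for this codebase); the default values below are made up, not the ones from paper_hparams.py:

```python
import tensorflow as tf  # TensorFlow 1.x

# Made-up defaults for illustration; see paper_hparams.py for the real values.
hparams = tf.contrib.training.HParams(
    wavenet_batch_size=8,
    max_time_steps=13000,
    wavenet_num_gpus=1,
)

# A --hparams "wavenet_batch_size=4,wavenet_num_gpus=2" flag boils down to this:
hparams.parse('wavenet_batch_size=4,wavenet_num_gpus=2')
print(hparams.wavenet_batch_size, hparams.wavenet_num_gpus)  # -> 4 2
```

Lowering wavenet_batch_size this way avoids the OOM on a single GPU, but I would rather keep the paper settings and spread the batch over several GPUs.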
I also noticed that multi-GPU logic already exists in `wavenet_vocoder/train.py`. However, when I start training with `python train.py --hparams wavenet_batch_size=4,wavenet_num_gpus=2`, I encounter the following error message: