syang1993 / gst-tacotron

A tensorflow implementation of the "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"
368 stars 110 forks

data feeder error #6

Closed fazlekarim closed 6 years ago

fazlekarim commented 6 years ago

After running the code for about 200 steps, I am running into the following error. I can't figure out why. I feel like it should be an easy fix.

, feed_dict=feed_dict)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/client/", line 889, in run
    run_metadata_ptr)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/client/", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/client/", line 1317, in _do_run
    options, run_metadata)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/client/", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.CancelledError: Enqueue operation was cancelled
  [[Node: datafeeder/input_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/input_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/linear_targets_0_2)]]

Caused by op 'datafeeder/input_queue_enqueue', defined at:
  File "", line 153, in <module>
    main()
  File "", line 149, in main
    train(log_dir, args)
  File "", line 58, in train
    feeder = DataFeeder(coord, input_path, hparams)
  File "/home/fakarim/projects/gst-tacotron/datasets/", line 46, in __init__
    self._enqueue_op = queue.enqueue(self._placeholders)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/ops/", line 327, in enqueue
    self._queue_ref, vals, name=scope)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/ops/", line 2777, in _queue_enqueue_v2
    timeout_ms=timeout_ms, name=name)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/framework/", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/framework/", line 2956, in create_op
    op_def=op_def)
  File "/home/fakarim/anaconda3/envs/gst-tacotron/lib/python3.6/site-packages/tensorflow/python/framework/", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

CancelledError (see above for traceback): Enqueue operation was cancelled [[Node: datafeeder/input_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/input_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/linear_targets_0_2)]]

fazlekarim commented 6 years ago

I think I am running out of memory. What type of GPU are you using, and how many? Is there a memory leak somewhere? It makes no sense that I would run out of memory after around 200 steps.


fazlekarim commented 6 years ago

Fixed it. The code is fine; I just overreacted.

marymirzaei commented 6 years ago

@fazlekarim I am having a similar issue. How did you solve this?

syang1993 commented 6 years ago

@fazlekarim @lapwing It is an OOM error. Some sentences in the dataset are very long, so a batch that contains them can run out of memory at some step. You can fix it by:

  1. Reduce batch_size or increase the reduce_factor. (Changing the reduce factor will affect performance.)
  2. Remove the overly long sentences. For example, remove all sentences that are longer than 1200 frames. This will shrink the dataset a little, but I guess it will not affect performance much.

fazlekarim commented 6 years ago

Do you have a script to remove sentences longer than 1200 frames?

syang1993 commented 6 years ago

@fazlekarim A simple way is to modify the data preprocessing script, as attached.
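
The attached script is not shown in this thread. As a rough sketch of the same idea, the filter could also be applied to the training metadata after preprocessing. This assumes each metadata line is pipe-delimited with the frame count in the third field (the `spectrogram_file|mel_file|n_frames|text` layout used by Tacotron-style preprocessing scripts); check your own `train.txt` before relying on it. The function name `filter_long_utterances` and its parameters are hypothetical, not part of the repository.

```python
from typing import Iterable, List


def filter_long_utterances(
    lines: Iterable[str],
    max_frames: int = 1200,   # threshold suggested in the thread
    frames_field: int = 2,    # ASSUMPTION: frame count is the third '|' field
) -> List[str]:
    """Return only the metadata lines whose frame count is <= max_frames."""
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("|")
        n_frames = int(fields[frames_field])
        if n_frames <= max_frames:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = [
        "a-spec.npy|a-mel.npy|800|a short sentence",
        "b-spec.npy|b-mel.npy|1500|a very long sentence",
    ]
    for kept_line in filter_long_utterances(sample):
        print(kept_line)
```

To apply it to a real file, read the metadata lines, pass them through the function, and write the result to a new file, so the original metadata stays untouched.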