NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

tacotron_gst.py inference on CPU #439

Open mrgloom opened 5 years ago

mrgloom commented 5 years ago

python run.py --config_file=example_configs/text2speech/tacotron_gst.py --mode=infer --infer_output_file=unused

*** Building graph on GPU:0
Traceback (most recent call last):
  File "run.py", line 103, in <module>
    main()
  File "run.py", line 78, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 883, in create_model
    model.compile(checkpoint=checkpoint)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/model.py", line 402, in compile
    self.get_data_layer(gpu_cnt).build_graph()
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/data/text2speech/text2speech.py", line 299, in build_graph
    self._dataset = tf.data.Dataset.from_tensor_slices(self._files)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 289, in from_tensor_slices
    return TensorSliceDataset(tensors)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1565, in __init__
    for i, t in enumerate(nest.flatten(tensors))
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1565, in <listcomp>
    for i, t in enumerate(nest.flatten(tensors))
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 229, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 542, in make_tensor_proto
    append_fn(tensor_proto, proto_values)
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 127, in tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 61, in as_bytes
    (bytes_or_text,))
TypeError: Expected binary or unicode string, got nan

Setting "num_gpus": 0 produces another error:

Traceback (most recent call last):
  File "run.py", line 103, in <module>
    main()
  File "run.py", line 78, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 882, in create_model
    model = base_model(params=infer_config, mode=args.mode, hvd=hvd)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/text2speech.py", line 215, in __init__
    super(Text2Speech, self).__init__(params, mode=mode, hvd=hvd)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/encoder_decoder.py", line 76, in __init__
    self._decoder = self._create_decoder()
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/encoder_decoder.py", line 102, in _create_decoder
    return self.params['decoder'](params=params, mode=self.mode, model=self)
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/decoders/tacotron2_decoder.py", line 212, in __init__
    self._n_feats = self._model.get_data_layer().params['num_audio_features']
  File "/Users/my_user/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/model.py", line 901, in get_data_layer
    return self._data_layers[worker_id]
IndexError: list index out of range
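
The traceback above ends with `self._data_layers[worker_id]` raising `IndexError`, which is consistent with the list of data layers being empty when `num_gpus` is 0. The toy class below reproduces that pattern under this assumption. `ToyModel`, its attributes, and the value 80 are hypothetical; they only mimic the names seen in the traceback and are not the actual OpenSeq2Seq implementation.

```python
class ToyModel:
    """Toy stand-in that builds one data layer per GPU (an assumption)."""

    def __init__(self, num_gpus):
        # One placeholder data layer per GPU; the list is empty when num_gpus == 0.
        self._data_layers = [{"num_audio_features": 80} for _ in range(num_gpus)]

    def get_data_layer(self, worker_id=0):
        # Indexing an empty list raises IndexError, as in the traceback.
        return self._data_layers[worker_id]


print(ToyModel(num_gpus=1).get_data_layer())  # one layer exists, so this works

try:
    ToyModel(num_gpus=0).get_data_layer()
except IndexError as err:
    print("IndexError:", err)
```

If this reading is right, any code path that asks for data layer 0 during construction would fail whenever no GPU workers are configured.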
mrgloom commented 5 years ago

I also tried it on GPU with tensorflow-gpu==1.10.0:

python run.py --config_file=example_configs/text2speech/tacotron_gst.py --mode=infer --infer_output_file=unused

*** Building graph on GPU:0
Traceback (most recent call last):
  File "run.py", line 103, in <module>
    main()
  File "run.py", line 78, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/data/my_user_data/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 891, in create_model
    model.compile(checkpoint=checkpoint)
  File "/data/my_user_data/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/models/model.py", line 402, in compile
    self.get_data_layer(gpu_cnt).build_graph()
  File "/data/my_user_data/external_projects/text-to-speech/OpenSeq2Seq/open_seq2seq/data/text2speech/text2speech.py", line 316, in build_graph
    self._dataset = tf.data.Dataset.from_tensor_slices(self._files)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 254, in from_tensor_slices
    return TensorSliceDataset(tensors)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1166, in __init__
    for i, t in enumerate(nest.flatten(tensors))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 1166, in <listcomp>
    for i, t in enumerate(nest.flatten(tensors))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 998, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1094, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 217, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 196, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 536, in make_tensor_proto
    append_fn(tensor_proto, proto_values)
  File "tensorflow/python/framework/fast_tensor_util.pyx", line 120, in tensorflow.python.framework.fast_tensor_util.AppendObjectArrayToTensorProto
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/compat.py", line 61, in as_bytes
    (bytes_or_text,))
TypeError: Expected binary or unicode string, got nan
mrgloom commented 5 years ago

The problem was in the *.csv file format.
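
The `TypeError` in the earlier tracebacks is raised when `tf.data.Dataset.from_tensor_slices` receives a value that is not a string, here a NaN. A minimal sketch of a pre-flight check follows. It assumes the CSV rows have already been loaded into Python lists, and the helper name `find_non_string_cells` is hypothetical, not part of OpenSeq2Seq.

```python
def find_non_string_cells(rows):
    """Return (row_index, column_index, value) for every cell that is not str/bytes.

    Hypothetical helper for illustration; not part of OpenSeq2Seq.
    """
    bad = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            if not isinstance(cell, (str, bytes)):
                bad.append((r, c, cell))
    return bad


# Example rows (made up): the second transcript is a float NaN, standing in
# for what a CSV reader may produce for an empty field.
rows = [
    ["wavs/0001.wav", "hello world"],
    ["wavs/0002.wav", float("nan")],
]

for r, c, cell in find_non_string_cells(rows):
    print(f"row {r}, column {c}: unexpected value {cell!r}")
```

Running such a check before building the dataset points at the offending row directly, instead of surfacing it deep inside TensorFlow's tensor conversion.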

After running successfully on GPU, it still produces an error on CPU:

*** Building graph on GPU:0
*** Inference Mode. Loss part of graph isn't built.
2019-05-20 16:42:05.832620: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
*** WARNING: Can't compute number of objects per step, since train model does not define get_num_objects_per_step method.
/usr/local/lib/python3.6/site-packages/librosa/effects.py:490: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return y[full_index], np.asarray([start, end])
Bus error: 10
blisc commented 5 years ago

I have not run into the same bus error, but it does seem that Tacotron GST does not work on CPU. My local runs don't seem to finish. I do not think we will be fixing this bug; if a solution is found, a pull request would be appreciated.

TheBrownViking20 commented 5 years ago

> It was a problem in *.csv file format.

What was the error in CSV though? I am facing a similar issue.

Shujian2015 commented 5 years ago

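@TheBrownViking20, please make sure there is no empty transcript ("") in the CSV. An empty field appears to be read as NaN, which matches the "got nan" TypeError above. You can replace it with a single space (" ").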

Ref: https://stackoverflow.com/questions/43183661/expected-binary-or-unicode-string-got-nan-tensorflow-pandas
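
The replacement suggested above can be sketched with plain string handling. This assumes a pipe-separated metadata file with unquoted fields; the separator, the sample rows, and the function name `fill_empty_fields` are assumptions for illustration, so adjust them to the actual dataset.

```python
def fill_empty_fields(lines, sep="|", filler=" "):
    """Yield each line with empty fields replaced by `filler`.

    Hypothetical helper; `lines` is any iterable of text lines,
    for example an open file object. Blank lines pass through unchanged.
    """
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped == "":
            yield stripped
            continue
        fields = stripped.split(sep)
        yield sep.join(field if field != "" else filler for field in fields)


# Made-up sample: the second row has an empty transcript.
sample = ["wavs/0001.wav|hello world\n", "wavs/0002.wav|\n"]
for fixed in fill_empty_fields(sample):
    print(repr(fixed))
```

Writing the yielded lines to a new file, rather than overwriting the original, keeps the source data intact in case the column layout assumption turns out to be wrong.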