TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Does tacotron2.inference support multi-threading, and is it thread-safe? #757

Closed · ttsking closed 2 years ago

ttsking commented 2 years ago

I use gunicorn (gevent mode) + Flask to set up a TTS web service. If two requests arrive at the same time, the following code may raise an exception. Should I add a threading.Semaphore to lock tacotron2.inference as a shared resource? My environment is TensorFlow 2.6.2 with a single GPU card.

```python
decoder_output, mel_outputs, stop_token_prediction, alignment_history = tacotron2.inference(
    input_ids=tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
    input_lengths=tf.convert_to_tensor([len(input_ids)], tf.int32),
    speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
)
```

```
[2022-04-16 18:06:03 +0000] [371] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/gunicorn/workers/base_async.py", line 115, in handle_request
    for item in respiter:
  File "/usr/local/lib/python3.8/dist-packages/werkzeug/wsgi.py", line 462, in __next__
    return self._next()
  File "/usr/local/lib/python3.8/dist-packages/werkzeug/wrappers/response.py", line 49, in _iter_encoded
    for item in iterable:
  File "/work/official_tts/run_keras_server.py", line 494, in rest_api_tts
    decoder_output, mel_outputs, stop_token_prediction, alignment_history = tacotron2.inference(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/def_function.py", line 924, in _call
    results = self._stateful_fn(*args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 3039, in __call__
    return graph_function._call_flat(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 1977, in _call_flat
    flat_outputs = forward_function.call(ctx, args_with_tangents)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/function.py", line 619, in call
    outputs = functional_ops.partitioned_call(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/functional_ops.py", line 1189, in partitioned_call
    args = [ops.convert_to_tensor(x) for x in args]
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/ops/functional_ops.py", line 1189, in <listcomp>
    args = [ops.convert_to_tensor(x) for x in args]
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/profiler/trace.py", line 163, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 1525, in convert_to_tensor
    raise RuntimeError("Attempting to capture an EagerTensor without "
RuntimeError: Attempting to capture an EagerTensor without building a function.
```
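For now I am considering serializing the calls. A minimal sketch of the lock-based workaround (the `synthesize` wrapper is a hypothetical name; it assumes `tacotron2` is already loaded at module scope, and a `threading.Lock` stands in for the semaphore, since only one concurrent caller is allowed anyway):

```python
import threading

import tensorflow as tf

# Guard tacotron2.inference so concurrent gunicorn/gevent requests
# never enter the tf.function at the same time.
_inference_lock = threading.Lock()

def synthesize(input_ids):
    # Only one request runs inference at a time; the rest block here.
    with _inference_lock:
        return tacotron2.inference(
            input_ids=tf.expand_dims(
                tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
            input_lengths=tf.convert_to_tensor([len(input_ids)], tf.int32),
            speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
        )
```

This serializes inference on the single GPU, so throughput is unchanged, but I would prefer to know whether the lock is actually required.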

ttsking commented 2 years ago

@dathudeptrai can you help explain whether inference in TensorFlow is thread-safe?
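To take gunicorn out of the picture, here is a minimal sketch of how one could try to reproduce the problem with plain threads (it assumes a loaded `tacotron2` model and a tokenized `input_ids` list, as in my service code above; I have not confirmed it triggers the exact same error outside the gevent worker):

```python
import threading

import tensorflow as tf

def call_inference():
    # Both threads hit the same tf.function-wrapped inference concurrently,
    # which is what happens when two web requests arrive together.
    tacotron2.inference(
        input_ids=tf.expand_dims(
            tf.convert_to_tensor(input_ids, dtype=tf.int32), 0),
        input_lengths=tf.convert_to_tensor([len(input_ids)], tf.int32),
        speaker_ids=tf.convert_to_tensor([0], dtype=tf.int32),
    )

threads = [threading.Thread(target=call_inference) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```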

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.