TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

RuntimeError when trying to run inference from TFLite for FastSpeech2 #237

Closed Zak-SA closed 3 years ago

Zak-SA commented 4 years ago

Hi, I converted the FastSpeech2 model to TFLite. When I try to run inference from the TFLite model I get this error:

decoder_output_tflite, mel_output_tflite = infer(input_text)
interpreter.invoke()
  File "/home/zak/venv/lib/python3.8/site-packages/tensorflow/lite/python/interpreter.py", line 539, in invoke
    self._interpreter.Invoke()
RuntimeError: tensorflow/lite/kernels/reshape.cc:55 stretch_dim != -1 (0 != -1)Node number 83 (RESHAPE) failed to prepare.

The code I used for this purpose is:

import numpy as np
import yaml
import tensorflow as tf

from tensorflow_tts.processor import ZAKSpeechProcessor
from tensorflow_tts.processor.ZAKspeech import ZAKSPEECH_SYMBOLS

from tensorflow_tts.configs import FastSpeechConfig, FastSpeech2Config
from tensorflow_tts.configs import MultiBandMelGANGeneratorConfig

from tensorflow_tts.models import TFFastSpeech, TFFastSpeech2
from tensorflow_tts.models import TFMBMelGANGenerator

from IPython.display import Audio

# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path='fastspeech2_quant.tflite')

# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Prepare input data.
def prepare_input(input_ids):
    input_ids = tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0)
    return (input_ids,
            tf.convert_to_tensor([0], tf.int32),
            tf.convert_to_tensor([1.0], dtype=tf.float32),
            tf.convert_to_tensor([1.0], dtype=tf.float32),
            tf.convert_to_tensor([1.0], dtype=tf.float32))

# Test the model on random input data.
def infer(input_text):
    for x in input_details:
        print(x)
    for x in output_details:
        print(x)
    processor = ZAKSpeechProcessor(data_dir=None,
                                   symbols=ZAKSPEECH_SYMBOLS,
                                   cleaner_names="arabic_cleaners")
    input_ids = processor.text_to_sequence(input_text.lower())
    interpreter.resize_tensor_input(input_details[0]['index'], [1, len(input_ids)])
    interpreter.resize_tensor_input(input_details[1]['index'], [1])
    interpreter.resize_tensor_input(input_details[2]['index'], [1])
    interpreter.resize_tensor_input(input_details[3]['index'], [1])
    interpreter.resize_tensor_input(input_details[4]['index'], [1])
    interpreter.allocate_tensors()
    input_data = prepare_input(input_ids)
    for i, detail in enumerate(input_details):
        input_shape = detail['shape']
        interpreter.set_tensor(detail['index'], input_data[i])

    interpreter.invoke()

    # The function get_tensor() returns a copy of the tensor data.
    # Use tensor() in order to get a pointer to the tensor.
    return (interpreter.get_tensor(output_details[0]['index']),
            interpreter.get_tensor(output_details[1]['index']))

# initialize melgan model
with open('../examples/multiband_melgan/conf/multiband_melgan.v1.yaml') as f:
    mb_melgan_config = yaml.load(f, Loader=yaml.Loader)
mb_melgan_config = MultiBandMelGANGeneratorConfig(**mb_melgan_config["multiband_melgan_generator_params"])
mb_melgan = TFMBMelGANGenerator(config=mb_melgan_config, name='mb_melgan_generator')
mb_melgan._build()
mb_melgan.load_weights("../examples/multiband_melgan/exp/train.multiband_melgan.v1/checkpoints/generator-1000000.h5")

input_text = ""

decoder_output_tflite, mel_output_tflite = infer(input_text)
audio_before_tflite = mb_melgan(decoder_output_tflite)[0, :, 0]
audio_after_tflite = mb_melgan(mel_output_tflite)[0, :, 0]

appreciate your help

Zak-SA commented 4 years ago

And one more question (or request): can you please share the code and the process to convert mb-melgan to TFlite? Thanks

manmay-nakhashi commented 4 years ago

@Zak-SA https://github.com/tensorflow/tensorflow/issues/40504

manmay-nakhashi commented 4 years ago

@Zak-SA If you are using the nightly version, convert your model with 2.4.0-dev20200630 and you can run it, or you can use 2.3.0 as the stable version for converting and running the tflite graph.
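
For reference, a minimal conversion sketch along the lines of the repo's demo notebooks (a sketch under assumptions: `fastspeech2` is a TFFastSpeech2 model with weights already loaded, and the `inference_tflite` function name follows the demo notebooks and may differ between versions):

import tensorflow as tf

# Assumes `fastspeech2` is a built TFFastSpeech2 instance with weights loaded.
concrete_fn = fastspeech2.inference_tflite.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # built-in TFLite ops
    tf.lite.OpsSet.SELECT_TF_OPS,    # Flex fallback for ops TFLite does not cover
]
tflite_model = converter.convert()
with open("fastspeech2_quant.tflite", "wb") as f:
    f.write(tflite_model)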

Zak-SA commented 4 years ago

@manmay-nakhashi I got the error when I used the nightly version 2.4.0. I didn't try tensorflow-gpu==2.3.

anhtienng commented 4 years ago

I also got this error! Have you fixed it @Zak-SA?

Zak-SA commented 4 years ago

@tienanh-1999 Unfortunately not yet, I'm still waiting to debug the issue. I didn't get the time to work on it. I will post any updates as soon as I have any.

JuliRao commented 3 years ago

@Zak-SA Hi, I also encountered this bug with tf_nightly-2.4.0.dev20200916. I used a different version, tf_nightly-2.4.0.dev20200630, and the bug is gone. Maybe you can try it, too.

abylouw commented 3 years ago

I can confirm that the tf_nightly-2.4.0.dev20200630 Python library indeed works, but I also have not been able to get the C library working.

dathudeptrai commented 3 years ago

> I can confirm that the tf_nightly-2.4.0.dev20200630 Python library indeed works, but I also have not been able to get the C library working.

About the C library, maybe we should ask @ZDisket :v. You can refer to his TensorVox here (https://github.com/ZDisket/TensorVox, will be integrated into tensorspeech in the future :)), maybe :v: ).

abylouw commented 3 years ago

From what I can see, @ZDisket is using the TensorFlow C++ library in TensorVox. I am trying to use the TensorFlow Lite C library. I got it working; at least for FastSpeech2 the inference works as expected, and I still need to try MB-MelGAN. I needed to add the Flex delegate library. It is still a bit large, but at least inference is working.

ZDisket commented 3 years ago

@abylouw I'd like to know how your TFLite mb-melgan runs. For me the audio was very noisy.

dathudeptrai commented 3 years ago

> @abylouw I'd like to know how your TFLite mb-melgan runs. For me the audio was very noisy.

Maybe because you used 8-bit tflite :))) For melgan, you should use float16 or float32.
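
For reference, a hedged sketch of a float16 post-training conversion for the vocoder (assuming `mb_melgan` is a TFMBMelGANGenerator with weights loaded, as in the code at the top of this issue; the name of its inference tf.function may differ between versions):

import tensorflow as tf

# Assumes `mb_melgan` is a built TFMBMelGANGenerator with weights loaded.
concrete_fn = mb_melgan.inference.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]  # float16 weights instead of 8-bit
tflite_model = converter.convert()
with open("mb_melgan_f16.tflite", "wb") as f:
    f.write(tflite_model)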

ZDisket commented 3 years ago

@dathudeptrai Never saw that documented, but I'll keep it in mind.

@abylouw

> is using the Tensorflow C++ library

Actually, I'm using the TF C library and a third-party wrapper (it's way easier than directly compiling the C++ one) for everything except the G2P (Phonetisaurus), but I'm working on changing that.

abylouw commented 3 years ago

I will post a sample as soon as I am done. For now I have only tested FS2, and then used the Python TensorFlow library for vocoding on the TFLite-produced mels. That sounded almost the same as the mels produced by TensorFlow FS2.

I am first trying to see if I can get it fast enough for my purposes.

abylouw commented 3 years ago

@ZDisket My findings currently are:

  • To get tflite working with an FS2 model you need to include the Flex delegate library, which bloats the size of tflite to 140+ MB.

I will revisit this in the future, but for now I will probably also use the TF C library.

dathudeptrai commented 3 years ago

@abylouw I will make a notebook for the mb-melgan tflite conversion; it is almost the same as the fs2/tacotron2 converter. For mb-melgan, I do not know why you need the flex delegate; the only model in this repo that needs the flex delegate is tacotron2 (while loop in the decoder) :D. For mb-melgan, everything is much simpler :D.

ZDisket commented 3 years ago

@dathudeptrai The Flex delegate is required for FastSpeech2 TFLite because it can't be exported with only TFLite built-in ops, and the delegate is required for the regular TF ops (at least the last time I tried).

dathudeptrai commented 3 years ago

> @dathudeptrai The Flex delegate is required for FastSpeech2 TFLite because it can't be exported with only TFLite built-in ops, and the delegate is required for the regular TF ops (at least the last time I tried).

hmm, maybe you are right :D, my private fastspeech2 is different so I do not need the flex delegate :D. But I mean on a real Android device :)); for the Python API, it seems we do not need the flex delegate to run FS2 tflite, as in the demo colab :D.

unparalleled-ysj commented 3 years ago

@dathudeptrai @manmay-nakhashi When I used tf-nightly==2.4.0.dev20201020, I also encountered this error: RuntimeError: tensorflow/lite/kernels/reshape.cc:55 stretch_dim != -1 (0 != -1)Node number 83 (RESHAPE) failed to prepare. After reading your discussion, I used tf-nightly==2.4.0.dev20200630 and the stable version of tf2.3.0 for conversion, but both encountered this error: Segmentation fault (core dumped). Can someone give some pointers?

dathudeptrai commented 3 years ago

> @dathudeptrai @manmay-nakhashi When I used tf-nightly==2.4.0.dev20201020, I also encountered this error: RuntimeError: tensorflow/lite/kernels/reshape.cc:55 stretch_dim != -1 (0 != -1)Node number 83 (RESHAPE) failed to prepare. After reading your discussion, I used tf-nightly==2.4.0.dev20200630 and the stable version of tf2.3.0 for conversion, but both encountered this error: Segmentation fault (core dumped). Can someone give some pointers?

Can you share your colab?

dathudeptrai commented 3 years ago

@unparalleled-ysj can you help me create a colab :)

unparalleled-ysj commented 3 years ago

@dathudeptrai Sorry, I have never used Colaboratory; it took me some time. https://colab.research.google.com/drive/1_PNAtx94Gn16ZCjxih5YC-mi9BAondSX?usp=sharing

unparalleled-ysj commented 3 years ago

@dathudeptrai Hi, is there any progress, or is there anything I need to do?

dathudeptrai commented 3 years ago

@unparalleled-ysj can you try this colab (https://colab.research.google.com/drive/1HudLLpT9CQdh2k04c06bHUwLubhGTWxA?usp=sharing)

tingyang01 commented 3 years ago

@dathudeptrai Hi, I'm also getting the same error:

Traceback (most recent call last):
  File "J:/Work/2017/SVNWork/Samy/TTS/TensorFlowTTS-master/Hindimodel/hi_thomas/fastspeech2_tflight.py", line 90, in infer
    self.interpreter.invoke()
  File "J:\Work\2017\SVNWork\Samy\TTS\TensorFlowTTS-master\Hindimodel\hi_thomas\tflight\lib\site-packages\tensorflow\lite\python\interpreter.py", line 539, in invoke
    self._interpreter.Invoke()
RuntimeError: tensorflow/lite/kernels/reshape.cc:55 stretch_dim != -1 (0 != -1)Node number 83 (RESHAPE) failed to prepare.

I am using this code:

import os
import tensorflow as tf
from tensorflow_tts.inference import AutoProcessor

class FastSpeech2Lite:
    def __init__(self, path, processor):
        # Load the TFLite model and allocate tensors.
        self.interpreter = tf.lite.Interpreter(model_path=path)

        # Get input and output tensors.
        self.inputs = self.interpreter.get_input_details()
        self.outputs = self.interpreter.get_output_details()
        self.processor = processor
        self.current_shape = self.inputs[0]['shape']

    def printDetails(self):
        for x in self.inputs:
            print(x)
        for x in self.outputs:
            print(x)

    def prepare_input(self, input_ids):
        input_ids = tf.expand_dims(tf.convert_to_tensor(input_ids, dtype=tf.int32), 0)
        return (input_ids,
                tf.convert_to_tensor([0], tf.int32),
                tf.convert_to_tensor([1.0], dtype=tf.float32),
                tf.convert_to_tensor([1.0], dtype=tf.float32),
                tf.convert_to_tensor([1.0], dtype=tf.float32))

    # Test the model on random input data.
    def infer(self, input_text):
        self.printDetails()
        input_ids = self.processor.text_to_sequence(input_text)
        # Resize if input length differs, assuming batch size is always 1.
        if self.current_shape[1] != len(input_ids):
            print(f"Input shape: {[1, len(input_ids)]} , interpreter shape: {self.current_shape}")
            print("Warning: Latency might be affected due to change in input shape")
            self.current_shape = [1, len(input_ids)]
            self.interpreter.resize_tensor_input(self.inputs[0]['index'], [1, len(input_ids)])
            self.interpreter.resize_tensor_input(self.inputs[1]['index'], [1])
            self.interpreter.resize_tensor_input(self.inputs[2]['index'], [1])
            self.interpreter.resize_tensor_input(self.inputs[3]['index'], [1])
            self.interpreter.resize_tensor_input(self.inputs[4]['index'], [1])
            self.interpreter.allocate_tensors()
        input_data = self.prepare_input(input_ids)
        for i, detail in enumerate(self.inputs):
            input_shape = detail['shape']
            self.interpreter.set_tensor(detail['index'], input_data[i])
        self.interpreter.invoke()
        # The function `get_tensor()` returns a copy of the tensor data.
        # Use `tensor()` in order to get a pointer to the tensor.
        # Decoder output skipped
        return self.interpreter.get_tensor(self.outputs[1]['index'])

processor = AutoProcessor.from_pretrained(pretrained_path=f"./ljspeech_mapper.json")
fs2_h5_path = f"fs2_model-1505000.h5"
fs2_conf_path = f"./fastspeech2.v1.yaml"
fs2_lite_path = f"saved_models_lite/hindi"
os.makedirs(fs2_lite_path, exist_ok=True)

# save_fastspeech2_to_lite: conversion helper (definition not shown here).
save_fastspeech2_to_lite(
    fs2_h5_path,
    fs2_conf_path,
    fs2_lite_path,
    processor
)

text = "Hello, Nice to meet you."
fastSpeech2lite = FastSpeech2Lite(os.path.join(fs2_lite_path, "fastspeech2_quant.tflite"), processor)
mel_output_tflite = fastSpeech2lite.infer(text)
print(f"output mels : {mel_output_tflite}")

So I am trying to install tf-nightly==2.4.0.dev20200630, but I can not install this version due to this error:

pip install tf-nightly-cpu==2.4.0-dev20200630
ERROR: Could not find a version that satisfies the requirement tf-nightly-cpu==2.4.0-dev20200630 (from versions: 2.4.0.dev20200901, 2.4.0.dev20200902, 2.4.0.dev20200903, 2.4.0.dev20200904, 2.4.0.dev20200905, 2.4.0.dev20200906, 2.4.0.dev20200907, 2.4.0.dev20200908, 2.4.0.dev20200910, 2.4.0.dev20200911, 2.4.0.dev20200912, 2.4.0.dev20200915, 2.4.0.dev20200916, 2.4.0.dev20200924, 2.4.0.dev20200925, 2.4.0.dev20200926, 2.4.0.dev20200927, 2.4.0.dev20200928, 2.4.0.dev20200929, 2.4.0.dev20200930, 2.4.0.dev20201001, 2.4.0.dev20201002, 2.4.0.dev20201003, 2.4.0.dev20201004, 2.4.0.dev20201005, 2.4.0.dev20201006, 2.4.0.dev20201007, 2.4.0.dev20201008, 2.4.0.dev20201009, 2.4.0.dev20201010, 2.4.0.dev20201011, 2.4.0.dev20201012, 2.4.0.dev20201015, 2.4.0.dev20201016, 2.4.0.dev20201023, 2.5.0.dev20201027, 2.5.0.dev20201028, 2.5.0.dev20201029)
ERROR: No matching distribution found for tf-nightly-cpu==2.4.0-dev20200630

Zak-SA commented 3 years ago

@tingyang01 How many steps did you train fastspeech2? I see the number 1505000 in fs2_h5_path. Did you mean 150000?

I don't know about tf-nightly-cpu, but I installed tf-nightly==2.4.0-dev20200630.

tingyang01 commented 3 years ago

Thanks for your reply. 150000 means the epoch number. I also installed tf-nightly==2.4.0-dev20200630, but it failed. I am using Python 3.6 and pip 20.2.4 on Windows. Let me share the log again.

pip install tf-nightly==2.4.0-dev20200630
ERROR: Could not find a version that satisfies the requirement tf-nightly==2.4.0-dev20200630 (from versions: 2.4.0.dev20200901, 2.4.0.dev20200902, 2.4.0.dev20200903, 2.4.0.dev20200904, 2.4.0.dev20200905, 2.4.0.dev20200906, 2.4.0.dev20200907, 2.4.0.dev20200908, 2.4.0.dev20200910, 2.4.0.dev20200911, 2.4.0.dev20200912, 2.4.0.dev20200924, 2.4.0.dev20200925, 2.4.0.dev20200926, 2.4.0.dev20200927, 2.4.0.dev20200928, 2.4.0.dev20200929, 2.4.0.dev20200930, 2.4.0.dev20201001, 2.4.0.dev20201002, 2.4.0.dev20201006, 2.4.0.dev20201007, 2.4.0.dev20201008, 2.4.0.dev20201009, 2.4.0.dev20201010, 2.4.0.dev20201011, 2.4.0.dev20201012, 2.4.0.dev20201015, 2.4.0.dev20201016, 2.4.0.dev20201017, 2.4.0.dev20201018, 2.4.0.dev20201019, 2.4.0.dev20201020, 2.4.0.dev20201021, 2.4.0.dev20201022, 2.4.0.dev20201023, 2.5.0.dev20201027, 2.5.0.dev20201028, 2.5.0.dev20201029)
ERROR: No matching distribution found for tf-nightly==2.4.0-dev20200630

OscarVanL commented 3 years ago

I also have no idea how to use TFLite FS2 models in Python. I tried the colab notebook, but also tried the code you quoted above, and get the error:

    405       ValueError: If the interpreter could not set the tensor.
    406     """
--> 407     self._interpreter.SetTensor(tensor_index, value)
    408 
    409   def resize_tensor_input(self, input_index, tensor_size, strict=False):

ValueError: Cannot set tensor: Got value of type BOOL but expected type INT32 for input 1, name: speaker_ids 
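
For reference, a hedged workaround sketch (not taken from this thread): the BOOL/INT32 mismatch usually means the tensors were set in an order that does not match get_input_details(). Matching inputs by tensor name avoids relying on the order; the exact tensor names below are assumptions and can be checked by printing input_details.

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="fastspeech2_quant.tflite")
input_details = interpreter.get_input_details()

input_ids = [1, 2, 3]  # placeholder; in practice processor.text_to_sequence(text)
feeds = {
    "input_ids": tf.convert_to_tensor([input_ids], dtype=tf.int32),
    "speaker_ids": tf.convert_to_tensor([0], dtype=tf.int32),
    "speed_ratios": tf.convert_to_tensor([1.0], dtype=tf.float32),
    "f0_ratios": tf.convert_to_tensor([1.0], dtype=tf.float32),
    "energy_ratios": tf.convert_to_tensor([1.0], dtype=tf.float32),
}

def match(detail):
    # Tensor names often look like "serving_default_input_ids:0"; match by substring.
    for key, value in feeds.items():
        if key in detail["name"]:
            return value
    raise KeyError(detail["name"])

for detail in input_details:
    interpreter.resize_tensor_input(detail["index"], list(match(detail).shape))
interpreter.allocate_tensors()
for detail in input_details:
    interpreter.set_tensor(detail["index"], match(detail))
interpreter.invoke()
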
lesswrongzh commented 3 years ago

> @dathudeptrai The Flex delegate is required for FastSpeech2 TFLite because it can't be exported with only TFLite built-in ops, and the delegate is required for the regular TF ops (at least the last time I tried).

> hmm, maybe you are right :D, my private fastspeech2 is different so I do not need the flex delegate :D. But I mean on a real Android device :)); for the Python API, it seems we do not need the flex delegate to run FS2 tflite, as in the demo colab :D.

@dathudeptrai The Flex delegate is required for FastSpeech2 TFLite: if I only import tflite_runtime (without the full tensorflow package), it shows this error:

RuntimeError: Regular TensorFlow ops are not supported by this interpreter. Make sure you apply/link the Flex delegate before inference.Node number 370 (FlexRandomUniform) failed to prepare

Code here (doesn't work): https://colab.research.google.com/drive/12PehpbPCPKYcLaFWX7B6VCCaIEq93E2j#scrollTo=y_h7FBydVNG_

(works): https://colab.research.google.com/drive/16tlyfE2Vdnd9hA5CkxaA5UvQ5zwKg6T2#scrollTo=y_h7FBydVNG_

Converted to the tflite model using https://colab.research.google.com/drive/1Ma3MIcSdLsOxqOKcN1MlElncYMhrOg3J#scrollTo=E4ClRmqr_gBk

dathudeptrai commented 3 years ago

@jackiegeek so, let's import tensorflow :)).
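
For reference, a minimal sketch of the difference: the standalone tflite_runtime interpreter is built without the Flex delegate, while the interpreter in the full tensorflow package links it in for the Python API.

# Fails to prepare Flex ops such as FlexRandomUniform:
# from tflite_runtime.interpreter import Interpreter
# interpreter = Interpreter(model_path="fastspeech2_quant.tflite")

# Works, because the full package bundles the Flex delegate for Python:
import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path="fastspeech2_quant.tflite")
interpreter.allocate_tensors()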

lesswrongzh commented 3 years ago

@dathudeptrai I don't want to install tensorflow T^T in some cases.

lesswrongzh commented 3 years ago

> • To get tflite working with an FS2 model you need to include the Flex delegate library, which bloats the size of tflite to 140+ MB.

Let me try, but I wonder: is there any other solution?

tingyang01 commented 3 years ago

Yeah, sorry. I am not using Flex delegate. I've installed only tensorflow 2.3.1.

lesswrongzh commented 3 years ago

OK... then I will wait until tflite supports more Select TensorFlow operators |ू・ω・` ). BTW, TensorFlow Lite with select TensorFlow ops is available in the TensorFlow pip package since version 2.3 for Linux and 2.4 for other environments.

langfield commented 3 years ago

@dathudeptrai I am getting a similar issue with attempting to run FS2. It appears to require the flex delegate. You mentioned above that you have a version of the model which doesn't require the flex delegate. Any chance you could share that or push it in a branch?

dathudeptrai commented 3 years ago

> @dathudeptrai I am getting a similar issue with attempting to run FS2. It appears to require the flex delegate. You mentioned above that you have a version of the model which doesn't require the flex delegate. Any chance you could share that or push it in a branch?

Hi, I still need the flex delegate. To run a model that requires the flex delegate, you should install tflite-support (implementation 'org.tensorflow:tensorflow-lite-support:0.0.0-nightly') and use a REAL Android device rather than a virtual machine.

langfield commented 3 years ago

Thanks for replying so quickly. I'm not running on Android, I'm running on a Raspberry Pi. Sorry, so can you explain what you meant here:

> hmm, maybe you are right :D, my private fastspeech2 is different so I do not need the flex delegate :D. But I mean on a real Android device :)); for the Python API, it seems we do not need the flex delegate to run FS2 tflite, as in the demo colab :D.

What do I do if I'm not building an AAR?

dathudeptrai commented 3 years ago

> Thanks for replying so quickly. I'm not running on Android, I'm running on a Raspberry Pi. Sorry, so can you explain what you meant here:
>
> > hmm, maybe you are right :D, my private fastspeech2 is different so I do not need the flex delegate :D. But I mean on a real Android device :)); for the Python API, it seems we do not need the flex delegate to run FS2 tflite, as in the demo colab :D.
>
> What do I do if I'm not building an AAR?

hmm, in your case, you should build 2 tflite models from fastspeech2 :))): Encoder + Decoder. They do not need the flex delegate; for the length regulator, you should code it in raw Python (it's so simple). BTW, why don't you use a saved model for the Raspberry Pi rather than tflite :).

langfield commented 3 years ago

> hmm, in your case, you should build 2 tflite models from fastspeech2 :)))

How do I do that exactly? I've been exporting the models as done in the notebooks and then importing them in a C++ program, which I'm building using bazel as instructed in the TFLite docs. So only the length regulator requires the flex delegate?

Also, what do you mean by use the saved model?

Thanks very much for your patience.

dathudeptrai commented 3 years ago

> > hmm, in your case, you should build 2 tflite models from fastspeech2 :)))
>
> How do I do that exactly? I've been exporting the models as done in the notebooks and then importing them in a C++ program, which I'm building using bazel as instructed in the TFLite docs. So only the length regulator requires the flex delegate?
>
> Also, what do you mean by use the saved model?
>
> Thanks very much for your patience.

I mean tf.saved_model.save(...) (a .pb file). And yes, only the length regulator requires the flex delegate, but it can be implemented in raw Python code easily :D.
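
To illustrate the raw-Python part, a minimal length-regulator sketch (the shapes are assumptions, and this is not code from the repo): each encoder frame is repeated according to its rounded duration.

import numpy as np

def length_regulator(encoder_hidden, durations):
    """Repeat each encoder frame by its predicted (rounded) duration.

    encoder_hidden: float array of shape [1, T, hidden_size].
    durations: int array of shape [1, T].
    Returns an array of shape [1, sum(durations), hidden_size].
    """
    expanded = np.repeat(encoder_hidden[0], durations[0], axis=0)
    return expanded[np.newaxis, ...]

# Toy example: 3 input frames expanded to 1 + 2 + 3 = 6 output frames.
enc = np.random.randn(1, 3, 384).astype(np.float32)
dur = np.array([[1, 2, 3]], dtype=np.int32)
print(length_regulator(enc, dur).shape)  # (1, 6, 384)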

langfield commented 3 years ago

This is what the interpreter complains about when I attempt to run FS2 without compiling with flex delegate. What specific lines in the length regulator are responsible for this? I don't see anything related to uniform there.

Node number 456 (FlexRandomUniform) failed to prepare.
dathudeptrai commented 3 years ago

> This is what the interpreter complains about when I attempt to run FS2 without compiling with flex delegate. What specific lines in the length regulator are responsible for this? I don't see anything related to uniform there.
>
> Node number 456 (FlexRandomUniform) failed to prepare.

The RandomUniform comes from the dropout function here (https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/models/fastspeech2.py#L254-L259). During inference, the model still enables dropout for the f0/energy embeddings :D.

langfield commented 3 years ago

Removing the dropout calls (but leaving the embeddings) fixed this issue. The Flex delegate was no longer needed for inference/compilation, and inference appears to work correctly.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.