Closed: freshwindy closed this issue 1 year ago
Oh sorry... I will update soon
Looking forward to your update. Could you give me an email address so that we can communicate? Thanks.
Hi @freshwindy. I'm sorry for the late update to the code. See make_mel_style.py. I also updated the model: see /MIST_tacotron. Delete /tacotron2 and then rename the directory /MIST_tacotron -> /tacotron2. This should make training easier for you.
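For anyone following along, the directory swap described above can be sketched in Python. This is only a minimal sketch: the function name `swap_model_dir` and its parameters are hypothetical and not part of the repository, and it assumes both directories sit directly under the repository root. It deletes the old directory permanently, so back it up first if you have local changes.

```python
import shutil
from pathlib import Path


def swap_model_dir(repo_root, new_name="MIST_tacotron", old_name="tacotron2"):
    """Delete `old_name` under `repo_root`, then rename `new_name` to `old_name`.

    Hypothetical helper: the directory names come from the comment above,
    everything else is an assumption made for illustration.
    """
    repo_root = Path(repo_root)
    new_dir = repo_root / new_name
    old_dir = repo_root / old_name

    # Refuse to delete anything if the replacement directory is missing.
    if not new_dir.is_dir():
        raise FileNotFoundError(f"{new_dir} does not exist; nothing to rename")

    # Destructive step: removes the old model directory and its contents.
    if old_dir.exists():
        shutil.rmtree(old_dir)

    # MIST_tacotron -> tacotron2
    new_dir.rename(old_dir)
    return old_dir
```

Calling `swap_model_dir(".")` from the repository root would perform the swap in place. Checking for the new directory before deleting the old one avoids ending up with neither.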
Hi, thank you for sharing. I have checked your updated code, but I still have a question. I found that your updated make_mel_style.py also adds an image style transfer model. Is this image processing step necessary? I ask because your model also has a style_encoder, which seems to apply image style transfer a second time.
Hi @freshwindy. I treat the mel-spectrogram feature like an image, so image processing is needed. GST tacotron uses tokens extracted from the mel-spectrogram as the feature, whereas MIST tacotron uses an image-style-transferred mel-spectrogram, generated from the mel-spectrogram by make_mel_style.py, as the feature.
Closed due to inactivity.
I found that the style image (style_img) is needed for training, but when I looked at your data processing code, I could not find the part that produces the image. There is only code that determines the image path from the audio path. Does this mean that a mel-spectrogram image has to be generated for each audio file?
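To make the question concrete, here is a minimal sketch of the kind of path mapping described, plus a check for audio files whose style image has not been generated yet. Everything in it is an assumption for illustration: the function names, the `wavs` and `mel_style` directory names, and the `.png` suffix are hypothetical and are not taken from the repository's data processing code.

```python
from pathlib import Path


def style_image_path(audio_path, audio_dir="wavs", style_dir="mel_style", suffix=".png"):
    """Map an audio path to the path where its style image is expected.

    Assumed layout (hypothetical): images mirror the audio tree, with the
    `audio_dir` component replaced by `style_dir` and the extension changed.
    """
    p = Path(audio_path)
    parts = [style_dir if part == audio_dir else part for part in p.parts]
    return Path(*parts).with_suffix(suffix)


def missing_style_images(audio_paths):
    """Return the audio paths whose expected style image does not exist on disk."""
    return [a for a in audio_paths if not style_image_path(a).is_file()]
```

If the training loader only derives the image path like this, then any audio file returned by `missing_style_images` would need its image produced first, presumably by running make_mel_style.py over it.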