Closed ghost closed 4 years ago
I am attempting to convert the existing synthesizer code to TensorFlow v2. I am making some progress but could use some help with this error message. demo_cli.py gets as far as starting the synthesizer test.
Affected code is in custom_decoder.py and helpers.py in synthesizer/models. I have a branch here: https://github.com/blue-fish/Real-Time-Voice-Cloning/commits/370_tf2_compat
[Edit: This issue is resolved. See the history for the original error message]
I figured out a solution to the above issue, to use tf.TensorShape().with_rank() to increase the rank as needed. Now working through a different set of errors.
Edit: Although it makes that one error go away, I do not know if it is the correct fix so I have not committed it.
@blue-fish Hi, can I contribute my TensorFlow 2 Tacotron2 implementation to this repo? Our framework also plans to support TFLite for TTS models, both the real-time vocoder and the Text2Mel model. Can you take a look at our framework?
github: https://github.com/TensorSpeech/TensorflowTTS sample audio: https://tensorspeech.github.io/TensorflowTTS/ colab demo: https://colab.research.google.com/drive/1akxtrLZHKuMiQup00tzO2olCaN-y3KiD?usp=sharing
We also support FastSpeech2, whose quality is comparable with Tacotron2 and which is much faster. Furthermore, we are working on support for other languages such as Chinese, Japanese ...
@dathudeptrai Yes! That's much better than me trying to convert the existing code to Tensorflow2.
@blue-fish The most important thing is that we need to reuse the pretrained model here, so converting the weights so that they can be loaded into my Tacotron2 implementation is the right way :)). My implementation is 90% the same as the Tacotron2 implementation here. We just need to modify some layers and parameters to replicate the model, and then we can easily load the pretrained weights here for inference.
@dathudeptrai I am new to TTS and don't have the expertise to make the changes you are describing. Would you be willing to point out exactly what needs to be changed? Or submit a pull request (working or not) to get us started?
> I figured out a solution to the above issue, to use tf.TensorShape().with_rank() to increase the rank as needed. Now working through a different set of errors.
> Edit: Although it makes that one error go away, I do not know if it is the correct fix so I have not committed it.

Can you tell me how you fixed the 'Shape must be rank 1 but is rank 0' error? I guess it is 'batch_size' at line 98 of tacotron.py
@DRob81 Although I solved it once before, it is eluding me this time. I thought I added with_rank() somewhere in synthesizer/models/helpers.py or custom_decoder.py, but it is not working. Thank you for your guess. I inspected batch_size in the debugger for tensorflow v1 and v2 and could not find any difference. I still think it is somewhere in the custom decoder, based on the error message.
@blue-fish I also debugged and ended up at batch_size. Glad I found that batch_size = tf.TensorShape(0).with_rank_at_least(1)[0] is the solution here. There are still more problems to fix.
Thanks for sharing that @DRob81. If I make that change then the next error message is TypeError: can only concatenate list (not "int") to list at line 218 of synthesizer/models/tacotron.py. It's hard to tell if it's getting further than before, since that is the same line where it errored out before. It is very hard to debug errors with the custom decoder.
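For readers who hit the same message: Python raises this TypeError whenever `+` has a list on the left and a bare int on the right. The sketch below reproduces it with plain Python values. The names `shape_prefix` and `extra_dim` are hypothetical stand-ins, not the actual operands at line 218 of tacotron.py, which this thread does not show.

```python
# Hypothetical stand-ins: the real operands at tacotron.py line 218
# are not shown in this thread.
shape_prefix = [2, 3]   # a list, for example some leading dimensions
extra_dim = 80          # a bare int

try:
    combined = shape_prefix + extra_dim   # list + int is not defined
except TypeError as err:
    message = str(err)

# Wrapping the int in a list makes both operands lists,
# so + concatenates them instead of raising.
combined = shape_prefix + [extra_dim]

print(message)
print(combined)
```

If one side really is meant to be a scalar, wrapping it in a list as shown is the usual fix. Whether that is the right change at line 218 depends on what the operands actually are.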
@blue-fish I fixed that already. Can you tell me which line exactly? I think my line numbers differ from yours now.
I'm unable to get past this part, before or after the last fix: https://github.com/blue-fish/Real-Time-Voice-Cloning/blob/621f62f150f5d0995ce61930479bec9e9043aebe/synthesizer/models/tacotron.py#L212-L218
It might be easier if you fork my repo using these instructions (https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/401#issuecomment-653929209). That would make it easier to share code updates and discuss issues like these.
@blue-fish I forked your repo and I will commit my changes.
Hi, keep in mind:
I think it would be good, for the record, to keep one branch on TensorFlow 1.x and move to a new master based on TensorFlow 2. I just checked, and there is an "automatic code updater" provided by TensorFlow. I ran it and can provide the report. In short, here is what I did:
import tensorflow as tf
print(tf.__version__)
2.2.0 in my case (Debian testing)
Run the following script from the parent directory of the project directory:
```bash
#!/bin/bash
tf_upgrade_v2 \
  --intree Real-Time-Voice-Cloning/ \
  --outtree Real-Time-Voice-Cloning_v2/ \
  --reportfile report.txt
```
This produces a lot of output and a report.txt.
I have not looked through the output or the contents of report.txt.
To check the result, go into the v2 directory, edit requirements.txt and set tensorflow==2.2.0,
then rerun the pip install to be sure:
pip install -r requirements.txt
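The manual edit of requirements.txt described above could also be scripted. The following is a minimal sketch, assuming the file contains a simple `tensorflow` line. `pin_requirement` is a hypothetical helper written for this illustration, and the sample requirements text is made up rather than taken from the repository.

```python
import re


def pin_requirement(requirements_text: str, package: str, version: str) -> str:
    """Return the requirements text with `package` pinned to `version`.

    Hypothetical helper: it only handles simple lines such as "name",
    "name==x" or "name>=x" and leaves every other line untouched.
    """
    pattern = re.compile(rf"^{re.escape(package)}\s*([=<>!~]=?.*)?$", re.IGNORECASE)
    lines = []
    for line in requirements_text.splitlines():
        if pattern.match(line.strip()):
            lines.append(f"{package}=={version}")
        else:
            lines.append(line)
    return "\n".join(lines) + "\n"


# Made-up sample content, not the repository's actual requirements.txt.
example = "tensorflow==1.15\nnumpy\ntensorflow-gpu>=1.10\n"
print(pin_requirement(example, "tensorflow", "2.2.0"))
```

Note that the pattern deliberately does not touch packages whose names merely start with the same prefix, such as `tensorflow-gpu` in the sample.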
All requirements are reported as satisfied. I don't have time to check further. Feel free to ping me; I will try to continue this weekend if I have time.
Oh, and I just followed the TensorFlow documentation, at least parts of it: https://www.tensorflow.org/guide/migrate https://www.tensorflow.org/guide/upgrade
@HumanG33k I have already performed automatic conversion using that process on the 370_tf2_compat branch of my fork. There are still a bunch of errors that need to be worked through. I have published fixes for some of these, and I am getting stuck on some others where @DRob81 is also helping. The current errors are runtime errors, so we may be getting close.
If you can run demo_cli.py without errors, please commit those changes to your fork and we can continue developing from there. If not, let's concentrate the effort on my tensorflow2 fork. I am accepting pull requests.
@DRob81 I would like to continue the tensorflow2 effort, can you please commit your changes or submit a pull request to my fork?
Hello @blue-fish, how can I add another voice on your Colab (I'd like to upload or link a 5-10 second wav sample)? Thanks!
https://colab.research.google.com/drive/1akxtrLZHKuMiQup00tzO2olCaN-y3KiD
We are not going to pursue tensorflow v2 now that the torch-based synthesizer is working (#472). Thanks to all who contributed their time here.
In #364, @CorentinJ wrote:
> The collab PR (#338) is an update to tensorflow 1.15.2 I believe. With #366 the rest of the repo also advances to 1.15.
One of the best things about this repo is that it works well out of the box. After set up, one can clone a voice with pretrained models, or replicate the original training procedure by following some simple instructions on the wiki. This needs to be the case no matter what solution is pursued here.
So far I have found these options which seem promising:
I think the easier approach is to try to support tensorflow v2 by using Mozilla TTS. It also has a larger userbase and better community support. If we switch, will the existing pretrained model continue to work?