NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference
BSD 3-Clause "New" or "Revised" License

Unable To Start New Model or Synthesis Text To Speech #546

Closed gmirsky2 closed 2 years ago

gmirsky2 commented 2 years ago

When I train my own pretrained model, which I have been training for months, it works. If I try to create a new one or synthesize text to speech, I get the error below.

I am unable to start a new model because I receive an error. I believe I have narrowed it down to loading the checkpoint. I am using my own code, but I have tried multiple notebooks (Cookie, bfs, etc.) and they all give the same error.

Load Checkpoint Code:

    # Load checkpoint if one exists
    iteration = 0
    epoch_offset = 0
    if checkpoint_path is not None and os.path.isfile(checkpoint_path):
        if warm_start:
            model = warm_start_model(
                checkpoint_path, model, hparams.ignore_layers)
        else:
            model, optimizer, _learning_rate, iteration = load_checkpoint(
                checkpoint_path, model, optimizer)
            if hparams.use_saved_learning_rate:
                learning_rate = _learning_rate
            iteration += 1  # next iteration is iteration + 1
            epoch_offset = max(0, int(iteration / len(train_loader)))
    else:
        # Download the LJSpeech pretrained model if no checkpoint already exists
        os.path.isfile("pretrained_model")  # result is unused, so the download always runs
        download_from_google_drive("1c5ZTuT7J08wLUoVZ2KkUs_VdZuJ86ZqA", "pretrained_model")
        model = warm_start_model("pretrained_model", model, hparams.ignore_layers)

It seems the LJSpeech pretrained model has become corrupted in some way. Can you help with this, or point me in the right direction to get this fixed?

Train Code:

train(output_directory, log_directory, checkpoint_path,
      warm_start, n_gpus, rank, group_name, hparams, log_directory2)

Error:

FP16 Run: False
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1555  100  1555    0     0  77750      0 --:--:-- --:--:-- --:--:-- 77750
Warm starting model from checkpoint 'pretrained_model'
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-11-e8a871a0e98d> in <module>()
      5 print('cuDNN Benchmark:', hparams.cudnn_benchmark)
      6 train(output_directory, log_directory, checkpoint_path,
----> 7       warm_start, n_gpus, rank, group_name, hparams, log_directory2)

3 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    775             "functionality.")
    776 
--> 777     magic_number = pickle_module.load(f, **pickle_load_args)
    778     if magic_number != MAGIC_NUMBER:
    779         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

I have tried researching this pickling error, and after about 10 minutes I figured out that it is raised inside torch. To my knowledge, there is nothing I can change in my code.

I'd also like to point out this part of the code: 779 raise RuntimeError("Invalid magic number; corrupt file?")

gmirsky2 commented 2 years ago

I have tried !tf_upgrade_v2

Thoerix commented 2 years ago

I have found out why it is throwing errors. Colab is having some weird error with downloading the model. It's not downloading it but replacing it with this:

< !DOCTYPE html> < html lang=en> < meta charset=utf-8> < meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width"> < title>Error 400 (Bad Request)!!1 < style> {margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px} > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px} < a href=//www.google.com/>< span id=logo aria-label=Google> < p>< b>400. < ins>That’s an error. < p>Your client has issued a malformed or illegal request. < ins>That’s all we know.

I guess we have to wait, or you could potentially copy the contents of the pretrained model into a file in Colab and rename it so it can be used as the model.
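
That also matches the traceback: the `'<'` in `invalid load key, '<'` is the first byte of the HTML page that was saved in place of the checkpoint. A minimal sketch using only the standard `pickle` module (standing in for the torch loading path, which is not reproduced here) shows the same message for any input that starts with HTML:

```python
import pickle

# Bytes that start like an HTML page, standing in for the bad download.
html_bytes = b"<!DOCTYPE html><html lang=en>"

try:
    pickle.loads(html_bytes)
    message = ""
except pickle.UnpicklingError as err:
    # The first byte '<' is not a valid pickle opcode, so loading stops here.
    message = str(err)

print(message)
```

So the error most likely says nothing about torch itself; it only means the file handed to it was not a pickled checkpoint.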

Edit: I've found a solution! Find the "train" function in that long chunk of code and replace:

download_from_google_drive("1c5ZTuT7J08wLUoVZ2KkUs_VdZuJ86ZqA","pretrained_model")
model = warm_start_model("pretrained_model", model, hparams.ignore_layers)

with:

!pip install gdown
import gdown
gdown.download('https://drive.google.com/u/0/uc?export=download&confirm=kZ1A&id=1c5ZTuT7J08wLUoVZ2KkUs_VdZuJ86ZqA', "pretrained_model", quiet=False)
model = warm_start_model("pretrained_model", model, hparams.ignore_layers)
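
To catch this kind of failure earlier next time, the download could be checked before it is passed to `warm_start_model`. This is only a rough sketch, not part of the notebook: `looks_like_html`, `fetch_checkpoint`, and the `downloader` hook are hypothetical names made up for illustration, and the check simply assumes that a real checkpoint never starts with `<`.

```python
import os


def looks_like_html(path, probe_bytes=64):
    """Return True if the file at `path` starts with '<', like an HTML page."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip()
    return head.startswith(b"<")


def fetch_checkpoint(url, output, downloader=None):
    """Download `url` to `output` and fail early if an HTML page came back.

    `downloader` is a hypothetical hook taking (url, output). When it is
    omitted, gdown.download is used, as in the fix above.
    """
    if downloader is None:
        import gdown  # third-party; installed with `pip install gdown`

        def downloader(u, o):
            gdown.download(u, o, quiet=False)

    downloader(url, output)
    if not os.path.isfile(output) or looks_like_html(output):
        raise RuntimeError(
            f"{output!r} is missing or looks like an HTML page, not a checkpoint"
        )
    return output
```

In the "train" function, the bare `gdown.download(...)` call would then become `fetch_checkpoint(url, "pretrained_model")` with the same URL, so a bad download fails with a readable message instead of an UnpicklingError.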

wucepticon81 commented 2 years ago

Hey, I'm brand new to all this voice cloning stuff. I've only been using Tacotron2/colab notebook for a few weeks now. So I still don't understand most of the technical lingo here. I just know the basics with using the notebooks. I saw that it's no longer working in training mode. I see that there is a possible fix for this? How exactly do I do this? I'd greatly appreciate any help. Pls, and thank you.

gmirsky2 commented 2 years ago

Thank you very much!!! I was able to modify the code to download the model this new way, and I am now able to train and create voices.