coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

TypeError: object of type 'NoneType' has no len() #1695

Closed bariscankurtkaya closed 2 years ago

bariscankurtkaya commented 2 years ago

Describe the bug

I'm using the TTS library to train my own models on AWS SageMaker Studio, and I trained some models with it successfully on v0.6. Unfortunately, since v0.7.1 I can't train the Tacotron2 model with the same configuration. I also checked the dependencies and, as far as I can tell, nothing changed. After the version update, the trainer.fit() function raises an error. You can find the error in the attached image.

To Reproduce

This is my configuration:

    import os

    from TTS.tts.configs.tacotron2_config import Tacotron2Config

    # output_path and dataset_config are defined earlier in the notebook (not shown).
    config = Tacotron2Config(
        batch_size=32,
        eval_batch_size=16,
        num_loader_workers=4,
        num_eval_loader_workers=4,
        run_eval=True,
        test_delay_epochs=-1,
        epochs=1000,
        text_cleaner="phoneme_cleaners",
        use_phonemes=True,
        phoneme_language="en-us",
        phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
        print_step=25,
        print_eval=False,
        mixed_precision=True,
        output_path=output_path,
        datasets=[dataset_config],
        save_step=10000,
    )

After that I also create the model:

    from TTS.tts.models.tacotron2 import Tacotron2

    model = Tacotron2(config, ap, tokenizer, speaker_manager=None)

and the last touch is trainer.fit() then voilà you have an error 😄

Expected behavior

It should start training the TTS model.

Logs

TypeError: object of type 'NoneType' has no len()

Environment

- TTS Version 0.7.1
- PyTorch Version 1.10
- Python Version 3.8
- OS AWS Linux
- Cuda/Cudnn 11.3/8.4
- AWS ml.g4dn.xlarge Cloud's GPU
- source
- I use the AWS SageMaker Studio to train the TTS Model
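
An environment list like the one above can also be gathered programmatically, which avoids typos when versions change between runs. The following is only a minimal sketch: the function name `collect_versions` is hypothetical, and the default package names (`TTS`, `torch`, `trainer`, `apex`) are assumptions about how the distributions are named in a given environment.

```python
import platform
from importlib import metadata


def collect_versions(packages=("TTS", "torch", "trainer", "apex")):
    """Return a dict of installed versions; None means the package was not found.

    The default package names are assumptions, adjust them to your environment.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Packages installed from source without metadata may show up as `None` even though they can be imported, so treat a missing entry as a hint rather than proof.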

Additional context

No response

p0p4k commented 2 years ago

Hello, please paste your entire traceback again. What I see is just an ipython/ultratb.py error, and I need to find out where in the TTS repo it originates. Thanks.
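
When a notebook's own traceback formatter fails, one way to still get the full text is to catch the exception yourself and format it with the standard `traceback` module. This is a minimal sketch, not something suggested in the thread: `run_and_capture` is a hypothetical helper, and the usage with `trainer.fit` in the comments assumes a `trainer` object already exists in the session.

```python
import traceback


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return (result, traceback_text).

    traceback_text is None when fn succeeds. BaseException is caught on
    purpose so that SystemExit (raised by sys.exit) is captured as well;
    note that this also swallows KeyboardInterrupt.
    """
    try:
        return fn(*args, **kwargs), None
    except BaseException:
        return None, traceback.format_exc()


# Hypothetical usage in a notebook cell:
# _, tb_text = run_and_capture(trainer.fit)
# if tb_text is not None:
#     print(tb_text)
```

The returned string is plain text, so it can be pasted into an issue as is.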

bariscankurtkaya commented 2 years ago

Okay, I will add it immediately.

bariscankurtkaya commented 2 years ago

Hello @p0p4k. Unfortunately, I could not reproduce the original error. I am currently getting this error instead:


IndexError                                Traceback (most recent call last)
/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in fit(self)
   1491         try:
-> 1492             self._fit()
   1493             if self.args.rank == 0:

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in _fit(self)
   1475             if not self.skip_train_epoch:
-> 1476                 self.train_epoch()
   1477             if self.config.run_eval:

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in train_epoch(self)
   1254         for cur_step, batch in enumerate(self.train_loader):
-> 1255             _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
   1256             loader_start_time = time.time()

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in train_step(self, batch, batch_n_steps, step, loader_start_time)
   1087             # training with a single optimizer
-> 1088             outputs, loss_dict_new, step_time = self._optimize(
   1089                 batch,

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in _optimize(self, batch, model, optimizer, scaler, criterion, scheduler, config, optimizer_idx, step_optimizer, num_optimizers)
    974             else:
--> 975                 outputs, loss_dict = self._model_train_step(batch, model, criterion)
    976 

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in _model_train_step(batch, model, criterion, optimizer_idx)
    930             return model.module.train_step(*input_args)
--> 931         return model.train_step(*input_args)
    932 

/opt/conda/lib/python3.8/site-packages/TTS/tts/models/tacotron2.py in train_step(self, batch, criterion)
    326         aux_input = {"speaker_ids": speaker_ids, "d_vectors": d_vectors}
--> 327         outputs = self.forward(text_input, text_lengths, mel_input, mel_lengths, aux_input)
    328 

/opt/conda/lib/python3.8/site-packages/TTS/tts/models/tacotron2.py in forward(self, text, text_lengths, mel_specs, mel_lengths, aux_input)
    205         # B x mel_dim x T_out -- B x T_out//r x T_in -- B x T_out//r
--> 206         decoder_outputs, alignments, stop_tokens = self.decoder(encoder_outputs, mel_specs, input_mask)
    207         # sequence masking

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used

/opt/conda/lib/python3.8/site-packages/TTS/tts/layers/tacotron/tacotron2.py in forward(self, inputs, memories, mask)
    320             memory = memories[len(outputs)]
--> 321             decoder_output, attention_weights, stop_token = self.decode(memory)
    322             outputs += [decoder_output.squeeze(1)]

/opt/conda/lib/python3.8/site-packages/TTS/tts/layers/tacotron/tacotron2.py in decode(self, memory)
    264         # self.query and self.attention_rnn_cell_state : B x D_attn_rnn
--> 265         self.query, self.attention_rnn_cell_state = self.attention_rnn(
    266             query_input, (self.query, self.attention_rnn_cell_state)

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used

/opt/conda/lib/python3.8/site-packages/torch/nn/modules/rnn.py in forward(self, input, hx)
   1188 
-> 1189         ret = _VF.lstm_cell(
   1190             input, hx,

/opt/conda/lib/python3.8/site-packages/apex/amp/wrap.py in wrapper(*args, **kwargs)
     20                 if utils.should_cache(args[i]):
---> 21                     args[i] = utils.cached_cast(cast_fn, args[i], handle.cache)
     22             for k in kwargs:

/opt/conda/lib/python3.8/site-packages/apex/amp/utils.py in cached_cast(cast_fn, x, cache)
     96             # Make sure x is actually cached_x's autograd parent.
---> 97             if cached_x.grad_fn.next_functions[1][0].variable is not x:
     98                 raise RuntimeError("x and cache[x] both require grad, but x is not "

IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-66-b6cd7ba69c0e> in <module>
----> 1 trainer.fit()

/opt/conda/lib/python3.8/site-packages/trainer/trainer.py in fit(self)
   1509                 os._exit(0)  # pylint: disable=protected-access
   1510         except BaseException:  # pylint: disable=broad-except
-> 1511             remove_experiment_folder(self.output_path)
   1512             traceback.print_exc()
   1513             sys.exit(1)

/opt/conda/lib/python3.8/site-packages/trainer/generic_utils.py in remove_experiment_folder(experiment_path)
     62     if not checkpoint_files:
     63         if fs.exists(experiment_path):
---> 64             fs.rm(experiment_path, recursive=True)
     65             logger.info(" ! Run is removed from %s", experiment_path)
     66     else:

/opt/conda/lib/python3.8/site-packages/fsspec/implementations/local.py in rm(self, path, recursive, maxdepth)
    145                 if osp.abspath(p) == os.getcwd():
    146                     raise ValueError("Cannot delete current working directory")
--> 147                 shutil.rmtree(p)
    148             else:
    149                 os.remove(p)

/opt/conda/lib/python3.8/shutil.py in rmtree(path, ignore_errors, onerror)
    720                     os.rmdir(path)
    721                 except OSError:
--> 722                     onerror(os.rmdir, path, sys.exc_info())
    723             else:
    724                 try:

/opt/conda/lib/python3.8/shutil.py in rmtree(path, ignore_errors, onerror)
    718                 _rmtree_safe_fd(fd, path, onerror)
    719                 try:
--> 720                     os.rmdir(path)
    721                 except OSError:
    722                     onerror(os.rmdir, path, sys.exc_info())

OSError: [Errno 39] Directory not empty: '/root/tts_train_dir/run-July-01-2022_08+51AM-0000000'

p0p4k commented 2 years ago

I think it's the same rm error from the other thread. Try that solution or restart your environment with a fresh install.
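
The second exception in the traceback above is raised by `shutil.rmtree` while the trainer cleans up the run folder. The fix from the other thread is not quoted here, so the following is only a generic sketch of removing a leftover run folder by hand before restarting. The helper name `remove_dir_with_retries` is hypothetical, and the idea that a short wait helps (for example because files in the folder are still being written) is an assumption, not something confirmed in this thread.

```python
import shutil
import time
from pathlib import Path


def remove_dir_with_retries(path, attempts=3, delay=0.5):
    """Try to delete a directory tree, retrying after OSError.

    Returns True if the directory no longer exists afterwards.
    """
    target = Path(path)
    for _ in range(attempts):
        if not target.exists():
            return True
        try:
            shutil.rmtree(target)
        except OSError:
            # Assumption: the failure may be transient, so wait and retry.
            time.sleep(delay)
    return not target.exists()


# Hypothetical usage with the folder named in the error message:
# remove_dir_with_retries("/root/tts_train_dir/run-July-01-2022_08+51AM-0000000")
```

If the folder still cannot be removed, the return value is `False` and the original cause needs to be investigated separately.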

erogol commented 2 years ago

@bariscankurtkaya are you running this on Colab?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

codepharmer commented 10 months ago

> @bariscankurtkaya are you running this on Colab?

@erogol I'm getting a similar issue running on Colab. Any advice or reference docs you can offer?

Internal Python error in the inspect module.
Below is the traceback from this internal error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1833, in fit
    self._fit()
  File "/usr/local/lib/python3.10/dist-packages/trainer/trainer.py", line 1785, in _fit
    self.train_epoch()
    380     # first frame (from in to out) that looks different.
    381     if not is_recursion_error(etype, value, records):
--> 382         return len(records), 0
    383 
    384     # Select filename, lineno, func_name to track frames with
...
TypeError: object of type 'NoneType' has no len()

tarudesu commented 6 months ago

Are there any solutions for this? I'm facing this on Colab too. @codepharmer @erogol

codepharmer commented 6 months ago

I think I ended up training a different model... (you can see the options in the recipes directory)