coqui-ai / TTS

πŸΈπŸ’¬ - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
35.48k stars, 4.33k forks

Directory Not Empty #1696

Closed bariscankurtkaya closed 2 years ago

bariscankurtkaya commented 2 years ago

Describe the bug

I trained a GlowTTS model with TTS v0.6, but after updating I can no longer train, because the run fails with a "Directory not empty" error for the model saving directory. I deleted the whole directory and started training again. After trainer.fit() is called, it creates a "run-June-27-2022 xxxxx" directory and still returns the error. I don't understand why this fails: the TTS library creates this directory itself to save the models, yet it raises the error after creating it.

To Reproduce

This is my configuration:

    import os
    from TTS.tts.configs.glow_tts_config import GlowTTSConfig

    # output_path and dataset_config are defined earlier in the script
    config = GlowTTSConfig(
        batch_size=32,
        eval_batch_size=16,
        num_loader_workers=4,
        num_eval_loader_workers=4,
        run_eval=True,
        test_delay_epochs=-1,
        epochs=1000,
        text_cleaner="phoneme_cleaners",
        use_phonemes=True,
        phoneme_language="en-us",
        phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
        print_step=25,
        print_eval=False,
        mixed_precision=True,
        output_path=output_path,
        datasets=[dataset_config],
        save_step=10000,
    )

After that, I also use:

    from TTS.tts.models.glow_tts import GlowTTS

    model = GlowTTS(config, ap, tokenizer, speaker_manager=None)

The last step is trainer.fit(). Sometimes it starts to train, but after 100-150 steps it returns the error 😭

Expected behavior

It should start to train and save the models to the directory.

Logs

Directory not empty: '/root/tts_train_dir/run-June-27-2022_xxxxx'

Environment

- TTS Version 0.7.1
- PyTorch Version 1.10
- Python Version 3.8
- OS AWS Linux
- Cuda/Cudnn 11.3/8.4
- AWS ml.g4dn.xlarge Cloud's GPU
- source
- I use the AWS SageMaker Studio to train the TTS Model

Additional context

No response

p0p4k commented 2 years ago

Hi. First, you might want to update to the latest trainer from PyPI. Second, you said TTS v0.6 in the text but TTS v0.7.1 in the environment section, so please check that. Third, in place of fs.rm(...) in trainer.py, you can try printing out exactly what it is trying to delete. Then, if I were you and I saw that nothing too risky was being deleted, I would try replacing 'rm' with 'rmtree' in trainer.py at line 64, so that non-empty folders can be deleted as well. That's how I would begin to debug this. I am not exactly sure that's the solution, but it could be a good start. Good luck!
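To illustrate the suggestion above, here is a minimal sketch using only the Python standard library. It is not the trainer's actual code: remove_dir_verbose is a hypothetical helper, and the demo runs on a throwaway temporary directory. It shows that a plain directory removal refuses a non-empty folder, while shutil.rmtree deletes it recursively after printing what is about to go.

```python
import os
import shutil
import tempfile


def remove_dir_verbose(path):
    """Print what is inside `path`, then delete it recursively.

    Hypothetical helper for illustration; not part of TTS or trainer.
    """
    contents = []
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            contents.append(os.path.relpath(os.path.join(root, name), path))
    contents.sort()
    print(f"About to delete {path!r} containing: {contents}")
    shutil.rmtree(path)  # removes the directory and everything in it
    return contents


# Demo on a throwaway directory, so nothing real is touched.
run_dir = tempfile.mkdtemp(prefix="run-demo-")
with open(os.path.join(run_dir, "events.log"), "w") as f:
    f.write("dummy")

try:
    os.rmdir(run_dir)  # plain removal refuses a non-empty directory
except OSError as exc:
    print("os.rmdir failed:", exc.strerror)

removed = remove_dir_verbose(run_dir)
```

Printing the contents first is the important part: it lets you confirm that nothing valuable, such as saved checkpoints, is inside before switching to a recursive delete.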

bariscankurtkaya commented 2 years ago

Thank you for your fast reply. I just wanted to clarify: I trained with TTS v0.6 before, but when I tried to train with TTS v0.7 it produced the error. I will try the "rmtree" advice and report back.
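Since the version mix-up came up, a quick way to confirm what is actually installed in the training environment might look like the sketch below. It assumes the packages are published under the distribution names "TTS" and "trainer"; installed_version is a hypothetical helper, and it returns None when a package is not found.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names below are assumptions; adjust them to your environment.
for name in ("TTS", "trainer"):
    print(name, installed_version(name))
```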

bariscankurtkaya commented 2 years ago

Thank you for your reply. I think it was an AWS SageMaker issue. It was fixed without my changing anything. So thank you.