coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process. #3657

Closed aryamanstha closed 5 months ago

aryamanstha commented 6 months ago

Describe the bug

I want to fine-tune XTTS v2 for the Nepali language.

I followed the steps in the XTTS v2 recipe:

https://github.com/coqui-ai/TTS/tree/dev/recipes/ljspeech/xtts_v2

I created the formatted dataset for the TTS model, as shown below: [image]

The metadata.txt file is shown below: [image]

I also created a custom formatter for the dataset, shown below: [image]

When I fit the trainer instance using trainer.fit(), I encounter the following problem:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:1833, in Trainer.fit(self)
   1832 try:
-> 1833     self._fit()
   1834     if self.args.rank == 0:

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:1787, in Trainer._fit(self)
   1786 if self.config.run_eval:
-> 1787     self.eval_epoch()
   1788 if epoch >= self.config.test_delay_epochs and self.args.rank <= 0:

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:1628, in Trainer.eval_epoch(self)
   1626 if self.eval_loader is None:
   1627     self.eval_loader = (
-> 1628         self.get_eval_dataloader(
   1629             self.training_assets,
   1630             self.eval_samples,
   1631             verbose=True,
   1632         )
   1633         if self.config.run_eval
   1634         else None
   1635     )
   1637 torch.set_grad_enabled(False)

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:990, in Trainer.get_eval_dataloader(self, training_assets, samples, verbose)
    988     return loader
--> 990 return self._get_loader(
    991     self.model,
    992     self.config,
    993     training_assets,
    994     True,
    995     samples,
    996     verbose,
    997     self.num_gpus,
    998 )

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:913, in Trainer._get_loader(self, model, config, assets, is_eval, samples, verbose, num_gpus)
    909 loader = model.get_data_loader(
    910     config=config, assets=assets, is_eval=is_eval, samples=samples, verbose=verbose, num_gpus=num_gpus
    911 )
--> 913 assert (
    914     len(loader) > 0
    915 ), " ❗ len(DataLoader) returns 0. Make sure your dataset is not empty or len(dataset) > 0. "
    916 return loader

AssertionError:  ❗ len(DataLoader) returns 0. Make sure your dataset is not empty or len(dataset) > 0.

During handling of the above exception, another exception occurred:

PermissionError                           Traceback (most recent call last)
Cell In[20], line 1
----> 1 trainer.fit()

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\trainer.py:1860, in Trainer.fit(self)
   1858     os._exit(1)  # pylint: disable=protected-access
   1859 except BaseException:  # pylint: disable=broad-except
-> 1860     remove_experiment_folder(self.output_path)
   1861     traceback.print_exc()
   1862     sys.exit(1)

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\trainer\generic_utils.py:77, in remove_experiment_folder(experiment_path)
     75 if not checkpoint_files:
     76     if fs.exists(experiment_path):
---> 77         fs.rm(experiment_path, recursive=True)
     78         logger.info(" ! Run is removed from %s", experiment_path)
     79 else:

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\site-packages\fsspec\implementations\local.py:185, in LocalFileSystem.rm(self, path, recursive, maxdepth)
    183     if osp.abspath(p) == os.getcwd():
    184         raise ValueError("Cannot delete current working directory")
--> 185     shutil.rmtree(p)
    186 else:
    187     os.remove(p)

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\shutil.py:739, in rmtree(path, ignore_errors, onerror)
    737     # can't continue even if onerror hook returns
    738     return
--> 739 return _rmtree_unsafe(path, onerror)

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\shutil.py:617, in _rmtree_unsafe(path, onerror)
    615     os.unlink(fullname)
    616 except OSError:
--> 617     onerror(os.unlink, fullname, sys.exc_info())
    618 try:
    619     os.rmdir(path)

File c:\Users\ASUS\AppData\Local\Programs\Python\Python310\lib\shutil.py:615, in _rmtree_unsafe(path, onerror)
    613 else:
    614     try:
--> 615         os.unlink(fullname)
    616     except OSError:
    617         onerror(os.unlink, fullname, sys.exc_info())

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'd:/FineTuning TTS model/GPT_XTTS_v2.0-April-01-2024_04+58PM-0000000\\trainer_0_log.txt'

To Reproduce

No Reproduce

Expected behavior

No response

Logs

>> DVAE weights restored from: XTTS_v2.0_original_model_files/dvae.pth
 | > Found 2064 files in D:\FineTuning TTS model\TTSDataset
 > Training Environment:
 | > Backend: Torch
 | > Mixed precision: False
 | > Precision: float32
 | > Num. of CPUs: 8
 | > Num. of Torch Threads: 1
 | > Torch seed: 1
 | > Torch CUDNN: True
 | > Torch CUDNN deterministic: False
 | > Torch CUDNN benchmark: False
 | > Torch TF32 MatMul: False
 > Start Tensorboard: tensorboard --logdir=GPT_XTTS_v2.0-April-01-2024_04+58PM-0000000

 > Model has 518442047 parameters

 > EPOCH: 0/1000
 --> GPT_XTTS_v2.0-April-01-2024_04+58PM-0000000
 > Filtering invalid eval samples!!
 > Total eval samples after filtering: 0

Environment

TTS                       0.22.0
pytorch-lightning         2.2.1
pytorch-metric-learning   2.4.1
Python 3.10.0
OS: Windows
GPU: AMD Radeon(TM) Vega 8 Graphics
Pytorch installed using pip

Additional context

No response

lexkoro commented 6 months ago

AssertionError: ❗ len(DataLoader) returns 0. Make sure your dataset is not empty or len(dataset) > 0.

Validate your custom formatter works correctly.

aryamanstha commented 6 months ago

@lexkoro The formatter works correctly, I think. When I print the training samples and evaluation samples, I get the following result: [image]

lexkoro commented 6 months ago

@aryamanstha Why do the audio_files have no extension?

aryamanstha commented 6 months ago

> @aryamanstha Why do the audio_files have no extension?

The formatter doesn't include the extension in the audio_file name. However, all the audio files have the .wav extension. [image]
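A minimal sketch of a formatter that appends the extension, plus a check that can be run before training. The `|`-separated `metadata.txt` layout, the `wavs` subfolder, the speaker name, and both function names are assumptions for illustration, since the actual files appear only as images in this thread. The returned keys follow the dictionary layout used by the formatters shipped with Coqui TTS.

```python
import os


def nepali_formatter(root_path, meta_file, **kwargs):
    """Hypothetical formatter: each metadata line is `<file_id>|<text>`."""
    items = []
    with open(os.path.join(root_path, meta_file), encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            file_id, text = line.split("|", 1)
            # Append the extension if the metadata omits it (assumed .wav).
            if not file_id.lower().endswith(".wav"):
                file_id += ".wav"
            items.append(
                {
                    "text": text,
                    "audio_file": os.path.join(root_path, "wavs", file_id),
                    "speaker_name": "nepali_speaker",  # placeholder name
                    "root_path": root_path,
                }
            )
    return items


def missing_audio_files(samples):
    """Return the audio paths that do not exist on disk."""
    return [s["audio_file"] for s in samples if not os.path.isfile(s["audio_file"])]
```

If `missing_audio_files(...)` returns anything for the eval samples, those entries cannot be loaded, which may be why the log reports `Total eval samples after filtering: 0`.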

reb1302 commented 5 months ago

Same issue:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 131, in _main
    prepare(preparation_data)
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 244, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 268, in _fixup_main_from_name
    main_content = runpy.run_module(mod_name,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 226, in run_module
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "E:\TTS\TTS\recipes\ljspeech\glow_tts\train_glowtts.py", line 90, in <module>
    trainer.fit()
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\site-packages\trainer\trainer.py", line 1860, in fit
    remove_experiment_folder(self.output_path)
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\site-packages\trainer\generic_utils.py", line 77, in remove_experiment_folder
    fs.rm(experiment_path, recursive=True)
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\site-packages\fsspec\implementations\local.py", line 185, in rm
    shutil.rmtree(p)
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\shutil.py", line 787, in rmtree
    return _rmtree_unsafe(path, onerror)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\shutil.py", line 634, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Users\AZPC\AppData\Local\Programs\Python\Python311\Lib\shutil.py", line 632, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'E:/TTS/TTS/recipes/ljspeech/glow_tts/run-April-03-2024_07+09PM-dbf1a08a\trainer_0_log.txt'

4ll0w3v1l commented 5 months ago

Yep, I have the same issue. On Windows it surfaces as WinError 32, while on macOS it gives the AssertionError that appears before the PermissionError in OP's traceback.
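This ordering matches how Python chains exceptions: an error raised while another is being handled is the one reported last, and the original is kept on `__context__`. A small sketch with stand-in functions (`fit_step`, `cleanup`, `run`, and `root_cause` are hypothetical names, not the trainer's code):

```python
def fit_step():
    # Stand-in for the failing dataloader check.
    raise AssertionError("len(DataLoader) returns 0")


def cleanup():
    # Stand-in for deleting a run folder whose log file is still open.
    raise PermissionError("[WinError 32] file is being used by another process")


def run():
    try:
        fit_step()
    except BaseException:
        cleanup()  # raises while the AssertionError is still being handled


def root_cause(exc):
    """Follow the implicit chain back to the first exception raised."""
    while exc.__context__ is not None:
        exc = exc.__context__
    return exc
```

Catching the `PermissionError` from `run()` and passing it to `root_cause` returns the `AssertionError`, which suggests that the empty dataloader is the error to fix first and the permission error is a side effect of cleanup.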

reb1302 commented 5 months ago

I have tried everything I can think of. It looks like the training code creates a folder containing log files that cannot be modified, possibly related to TensorBoard.
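On Windows a file normally cannot be deleted while a handle to it is still open, which would be consistent with cleanup failing on `trainer_0_log.txt`. A minimal sketch of releasing a log file before removing its folder, assuming the file is held by a standard `logging.FileHandler`; how the trainer actually opens its log is not shown in this thread, and `close_file_handlers` and the `demo_trainer` logger are hypothetical:

```python
import logging
import os
import shutil
import tempfile


def close_file_handlers(logger, folder):
    """Close and detach FileHandlers that write inside `folder`."""
    prefix = os.path.abspath(folder) + os.sep
    for handler in list(logger.handlers):
        if isinstance(handler, logging.FileHandler):
            if os.path.abspath(handler.baseFilename).startswith(prefix):
                handler.close()
                logger.removeHandler(handler)


# Create a throwaway run folder with an open log file inside it.
run_dir = tempfile.mkdtemp(prefix="run-")
logger = logging.getLogger("demo_trainer")
logger.addHandler(logging.FileHandler(os.path.join(run_dir, "trainer_0_log.txt")))
logger.warning("training aborted")

close_file_handlers(logger, run_dir)  # release the handle first
shutil.rmtree(run_dir)  # the folder can now be removed
```

This only addresses the cleanup step; the run would still abort on whatever exception triggered the cleanup.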

reb1302 commented 5 months ago

It seems the issue was related to multiprocessing. I wrapped the main code in train.py in a def main(): function, called under an if __name__ == '__main__': guard, and it worked. Here is an example:

import os
from trainer import Trainer, TrainerArgs
from TTS.tts.configs.glow_tts_config import GlowTTSConfig
from TTS.tts.configs.shared_configs import BaseDatasetConfig
from TTS.tts.datasets import load_tts_samples
from TTS.tts.models.glow_tts import GlowTTS
from TTS.tts.utils.text.tokenizer import TTSTokenizer
from TTS.utils.audio import AudioProcessor

def main():
    output_path = os.path.dirname(os.path.abspath(__file__))
    dataset_config = BaseDatasetConfig(
        formatter="ipa_format",
        path="E://model//DATASET//viet-tts",
        meta_file_train="E://model//DATASET//viet-tts//meta_data.tsv",
    )

    config = GlowTTSConfig(
        # Configuration parameters
    )

    ap = AudioProcessor.init_from_config(config)
    tokenizer, config = TTSTokenizer.init_from_config(config)
    train_samples, eval_samples = load_tts_samples(
        # Loading samples parameters
    )
    model = GlowTTS(config, ap, tokenizer, speaker_manager=None)
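    trainer = Trainer(
        TrainerArgs(), config, output_path, model=model, train_samples=train_samples, eval_samples=eval_samples
    )

    trainer.fit()


# On Windows, multiprocessing starts workers with "spawn", which re-imports this
# module in every worker. The guard keeps workers from re-running the training code.
if __name__ == '__main__':
    main()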