Finetune - exit status 3221225477

fabianrossbach commented 2 weeks ago

Checks

[X] This template is only for usage issues encountered.
[X] I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
[X] I have searched for existing issues, including closed ones, and couldn't find a solution.
[X] I confirm that I am using English to submit this report in order to facilitate communication.

Environment Details

Hi,

I'm trying to get training running. Yesterday it was working fine, but I reinstalled because continuing training wasn't working as expected.

Now I can't get it running at all. I get as far as the Train Data tab, and training fails there.

The output from CMD (I'm running CMD as admin):

vocab :  2545
Using logger: None
Loading dataset ...
Download Vocos from huggingface charactr/vocos-mel-24khz

Sorting with sampler... if slow, check whether dataset is provided with duration:   0%|          | 0/30863 [00:00<?, ?it/s]
Sorting with sampler... if slow, check whether dataset is provided with duration: 100%|##########| 30863/30863 [00:00<00:00, 2681043.11it/s]

Creating dynamic batches with 2400 audio frames per gpu:   0%|          | 0/30863 [00:00<?, ?it/s]
Creating dynamic batches with 2400 audio frames per gpu: 100%|##########| 30863/30863 [00:00<00:00, 3082481.35it/s]

Epoch 1/370:   0%|          | 0/6140 [00:00<?, ?step/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\F5\F5-TTS\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\F5\F5-TTS\venv\Lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
    args.func(args)
  File "C:\F5\F5-TTS\venv\Lib\site-packages\accelerate\commands\launch.py", line 1168, in launch_command
    simple_launcher(args)
  File "C:\F5\F5-TTS\venv\Lib\site-packages\accelerate\commands\launch.py", line 763, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\F5\\F5-TTS\\venv\\Scripts\\python.exe', 'src/f5_tts/train/finetune_cli.py', '--exp_name', 'F5TTS_Base', '--learning_rate', '1e-05', '--batch_size_per_gpu', '2400', '--batch_size_type', 'frame', '--max_samples', '64', '--grad_accumulation_steps', '1', '--max_grad_norm', '1', '--epochs', '370', '--num_warmup_updates', '1544', '--save_per_updates', '10000', '--last_per_steps', '772', '--dataset_name', 'german_tts', '--finetune', 'True', '--tokenizer', 'pinyin', '--log_samples', 'True', '--logger', 'wandb']' returned non-zero exit status 3221225477.
accelerate launch --mixed_precision=bf16 src/f5_tts/train/finetune_cli.py --exp_name F5TTS_Base --learning_rate 1e-05 --batch_size_per_gpu 1200 --batch_size_type frame --max_samples 64 --grad_accumulation_steps 1 --max_grad_norm 1 --epochs 370 --num_warmup_updates 1544 --save_per_updates 5000 --last_per_steps 772 --dataset_name german_tts --finetune True --tokenizer pinyin  --log_samples True --logger wandb

Steps to Reproduce

Start venv and finetune
Transcribe, Vocab Check, Prepare Data
Train Data -> Error

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

SWivid commented 2 weeks ago

No clear idea; it looks like a memory issue. Maybe check your accelerate config first.
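For context on the number in the title: on Windows, a process exit status is an unsigned 32-bit value, and crash codes are usually easier to recognize in hexadecimal. The sketch below decodes the status from the traceback. The helper name `describe_exit_status` and the small lookup table are illustrative assumptions and not part of F5-TTS or accelerate; the table lists only a few well-known NTSTATUS values.

```python
# Minimal sketch: decode a Windows process exit status into hex and a name.
# KNOWN_NTSTATUS is a hand-picked, non-exhaustive table (an assumption for
# illustration); describe_exit_status is a hypothetical helper.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
}


def describe_exit_status(code: int) -> str:
    """Return the exit status as 0x-prefixed hex plus a name if known."""
    # Some tools report the same status as a negative signed 32-bit number,
    # so normalise to the unsigned form first.
    code &= 0xFFFFFFFF
    name = KNOWN_NTSTATUS.get(code, "unknown")
    return f"0x{code:08X} ({name})"


# The status reported by subprocess.CalledProcessError in the log above.
print(describe_exit_status(3221225477))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```

An access violation means the child process touched memory it was not allowed to, which usually points to a crash in native code rather than a Python exception. That would also explain why the traceback shows only the accelerate launcher and nothing from `finetune_cli.py` itself. It does not say which component crashed.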

SWivid commented 1 week ago

Closing, as no further info was received.

SWivid / F5-TTS