voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Availability of the Colab notebook #1168

Open 34j opened 6 months ago

34j commented 6 months ago

Many issues have been opened about the Colab notebook. I think it is still just barely working, but it is in rough shape.

Describe the bug

  1. f-strings in commands no longer work
  2. #1064 might unfortunately be reducing the simplicity
  3. Logs are not displayed
  4. Annoying popup: #1163

To Reproduce

Run notebook

Additional context

No response

Version

2024/05/10

Platform

Google Colab

Code of Conduct

No Duplicate

xD0135 commented 1 month ago

I had the same experience trying to run the provided Colab notebook, so I settled on the simpler approach shown below. Each code block here corresponds to a cell you add in Colab, and I'm covering only training, not the inference part that the original notebook also includes. Also, make sure you select a GPU runtime, otherwise this won't work.

#@title Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
#@title Install Dependencies
!python -m pip install so-vits-svc-fork
#@title Verify Dependencies
!svc --help
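
Since none of this works without a GPU runtime (as noted above), I also like to add a quick sanity-check cell here; this is a generic check, not something from the original notebook:

#@title Check GPU
# Generic sanity check (not from the original notebook): confirm Colab
# actually assigned a CUDA device before spending time on preprocessing.
import torch

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected - change the runtime type to GPU first.")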

For this next step, I created a folder MyDrive/TTS/sovits in my Google Drive, which is why it appears in the paths below. Inside that folder I added my training data in the expected layout dataset_raw/{speaker_id}/**/{wav_file}.{any_format}, then ran the following:

#@title Generate Config
!cd /content/drive/MyDrive/TTS/sovits && svc pre-resample && svc pre-config
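
If pre-resample finishes without finding any files, the dataset_raw layout is the usual culprit; a plain directory listing (nothing specific to the fork) makes that easy to spot:

#@title (Optional) Inspect dataset layout
# Generic check: list what sits under dataset_raw so the speaker folder and
# audio files are visible. The path matches the Drive layout used above.
!find /content/drive/MyDrive/TTS/sovits/dataset_raw -maxdepth 2 | head -n 20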

This will create a few folders. Because I'm running on the free T4 Colab instance, I adjusted the generated config file, located in my Google Drive under MyDrive/TTS/sovits/configs/44k/config.json, as follows (only the modified lines are shown):

{
  "train": {
    "epochs": 201,
    "batch_size": 16
  }
}

If you edit the config file, save/upload the changes before proceeding. It's worth noting that a batch_size of 16 keeps VRAM usage stable at around 11.9 / 15.0 GB; anything higher will cause it to run out of memory (OOM).
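
If you'd rather make those edits from the notebook itself instead of editing the file in Drive (so there's nothing to re-upload), here is a minimal sketch that assumes the same path and values as above:

#@title (Optional) Edit config from the notebook
# Sketch only: rewrites the two values changed above. cfg_path matches the
# Drive layout used in this walkthrough; adjust it if yours differs.
import json

cfg_path = "/content/drive/MyDrive/TTS/sovits/configs/44k/config.json"
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["train"]["epochs"] = 201
cfg["train"]["batch_size"] = 16

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)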

#@title Generate Hubert
!cd /content/drive/MyDrive/TTS/sovits && svc pre-hubert
#@title Start Training
!cd /content/drive/MyDrive/TTS/sovits && svc train

And that's it! For the training step, if you want to load TensorBoard so that you have a visual representation of the training progress, you can use the following cell instead of the previous one:

#@title Start Training
%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/TTS/sovits/logs/44k
!cd /content/drive/MyDrive/TTS/sovits && svc train
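
Once training finishes, the checkpoints should end up under the same logs directory that TensorBoard reads from in the setup above; a quick listing (generic, nothing fork-specific) shows what was produced:

#@title List training outputs
# Generic listing of the log/checkpoint directory used above; the exact file
# names depend on the run, so treat this as a convenience check only.
!ls -lh /content/drive/MyDrive/TTS/sovits/logs/44k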

For my dataset, with about 13 minutes of training audio, training took just over an hour to complete. Now that I know how long a run takes, I can increase the epochs parameter to match my desired tradeoff between training time and output quality. Hope it helps!