myshell-ai / MeloTTS

High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
MIT License

Finetuning model with customized dataset #173

Open lkim0402 opened 3 months ago

lkim0402 commented 3 months ago

Hello! I want MeloTTS to speak in a voice from my own dataset. I'm fairly new to this, and I'm confused about how to take the checkpoints or pretrained models and fine-tune them on my own data.

Right now all I have is this script, which I run with model.py.

import torch
from melo.api import TTS

# Speed is adjustable
speed = 1.0
device = 'cpu' # or cuda:0

text = "안녕하세요! 오늘은 날씨가 정말 좋네요."
checkpoint_path = '/home/tts/MeloTTS/melo/configs/checkpoint.pth'
config_path = '/home/tts/MeloTTS/melo/data/kor_config.json'

model = TTS(language='KR', device=device, config_path=config_path, ckpt_path=checkpoint_path)
speaker_ids = model.hps.data.spk2id

For kor_config, I am using the Korean config file, and in it I have set:

"training_files": "/home/tts/MeloTTS/melo/data/train.list",
"validation_files": "/home/tts/MeloTTS/melo/data/val.list",
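To check that these two entries point at files that actually exist before starting a training run, something like the sketch below could help. It is only a sanity check, not part of MeloTTS. The function name `check_list_paths` is made up for illustration, and it assumes the two keys sit under a top-level "data" object (falling back to the top level of the JSON if there is no such object).

```python
import json
import os


def check_list_paths(config_path):
    """Return {key: (path, exists)} for the train/val list entries of a config."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    # Assumption: the keys live under "data"; otherwise look at the top level.
    section = config.get("data", config)

    result = {}
    for key in ("training_files", "validation_files"):
        path = section.get(key)
        # A missing key is reported as (None, False).
        result[key] = (path, path is not None and os.path.isfile(path))
    return result


# Example call with the config path from the script above:
# for key, (path, ok) in check_list_paths(config_path).items():
#     print(key, path, "found" if ok else "MISSING")
```

A "MISSING" result for either key would mean the list file still has to be created, or the path in the config corrected, before training can read it.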

The checkpoint is the Korean checkpoint referenced in download_utils.py.

How can I now fine-tune the model on my own data?