huangkun1985 commented 1 month ago

Please vote for Cantonese!!!

SWivid commented 1 month ago

b（￣▽￣）d　

wong813 commented 1 month ago

Yes Cantonese needed!!!

ringolam commented 3 weeks ago

I strongly request Cantonese language support.

elbartohub commented 3 weeks ago

Yes, one of the most powerful languages on Earth.

pebblehack commented 3 weeks ago

It's not an issue, but Cantonese would be neat.

indiejoseph commented 2 days ago

+1

chau9ho commented 2 days ago

I have fine-tuned a Cantonese model with F5-TTS.

Dataset Details

Dataset: Common Voice Corpus 19.0
Validated Hours: 109 hours
Training Steps: ~190K steps

Current Results

The model can generate Cantonese speech, but has the following issues:
Strong Mandarin (Putonghua) accent
Unnatural Cantonese pronunciation
Some characters are still pronounced in Mandarin despite being in vocab.txt
Numeric characters are not pronounced
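
One possible workaround for the unpronounced digits is to spell them out as Chinese numerals before the text reaches the model. The snippet below is a minimal sketch of that idea, not part of F5-TTS: `int_to_chinese` and `normalize_digits` are hypothetical helper names, numbers up to 9999 are read as quantities, and longer digit runs are read digit by digit. Years, phone numbers, decimals, and colloquial Cantonese readings are not handled.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # units for the last four decimal places


def int_to_chinese(n: int) -> str:
    """Spell out an integer in the range 0..9999 with Chinese numerals."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is supported in this sketch")
    if n == 0:
        return DIGITS[0]
    s = str(n)
    parts = []
    pending_zero = False
    for i, ch in enumerate(s):
        d = int(ch)
        pos = len(s) - 1 - i  # 0 = ones, 1 = tens, 2 = hundreds, 3 = thousands
        if d == 0:
            pending_zero = True  # emit a single 零 only if more digits follow
            continue
        if pending_zero and parts:
            parts.append(DIGITS[0])
        pending_zero = False
        parts.append(DIGITS[d] + UNITS[pos])
    result = "".join(parts)
    if 10 <= n <= 19:
        result = result[1:]  # 一十二 is normally written 十二
    return result


def normalize_digits(text: str) -> str:
    """Replace every run of ASCII digits in text with Chinese numerals."""
    def repl(match: re.Match) -> str:
        run = match.group(0)
        if len(run) <= 4:
            return int_to_chinese(int(run))
        # Long runs (IDs, phone numbers) are read digit by digit.
        return "".join(DIGITS[int(c)] for c in run)

    return re.sub(r"[0-9]+", repl, text)


print(normalize_digits("我有3個蘋果"))
```

Applying the same normalization to the training transcripts and to the inference text keeps the two consistent, so the model never sees a raw digit.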

Training Configuration

{
  "exp_name": "F5TTS_Base",
  "learning_rate": 7.5e-05,
  "batch_size_per_gpu": 12000,
  "batch_size_type": "frame",
  "max_samples": 64,
  "grad_accumulation_steps": 4,
  "max_grad_norm": 0,
  "epochs": 50,
  "num_warmup_updates": 2000,
  "save_per_updates": 20000,
  "last_per_steps": 500,
  "finetune": true,
  "file_checkpoint_train": "",
  "tokenizer_type": "char",
  "tokenizer_file": "",
  "mixed_precision": "fp16",
  "logger": "tensorboard"
}
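
To put the batch settings in perspective, the arithmetic below estimates how much audio goes into one optimizer update. It is only a rough sketch: it assumes that a "frame" batch size counts mel-spectrogram frames, that one GPU is used, and that the mel hop length is 256 samples at a 24 kHz sample rate. None of these three values appears in the config above, so adjust them to the actual setup.

```python
def frames_per_update(batch_size_per_gpu: int,
                      grad_accumulation_steps: int,
                      num_gpus: int = 1) -> int:
    """Frames consumed by one optimizer update across all GPUs."""
    return batch_size_per_gpu * grad_accumulation_steps * num_gpus


def frames_to_seconds(frames: int,
                      hop_length: int = 256,      # assumed, not from the config
                      sample_rate: int = 24000    # assumed, not from the config
                      ) -> float:
    """Convert a mel-frame count to seconds of audio."""
    return frames * hop_length / sample_rate


# Values taken from the config above; num_gpus=1 is an assumption.
frames = frames_per_update(batch_size_per_gpu=12000, grad_accumulation_steps=4)
print(frames, frames_to_seconds(frames))  # 48000 512.0
```

Under these assumptions each update sees at most about 512 seconds of audio, which can help when comparing step counts with other fine-tuning reports.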

Questions

Character/Vocab Issues:

Why are some characters skipped despite being in vocab.txt?
How to improve numeric character handling?
Best way to enforce Cantonese pronunciation for all characters?

Training Concerns:

Would more training steps help with pronunciation?
Is the dataset quality affecting accent/pronunciation?
How to reduce the Mandarin accent influence?

Looking for:

Solutions for character skipping issues
Methods to improve numeric character handling
Community experience with similar issues
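
For anyone trying to reproduce the character skipping, a quick sanity check is to count which transcript characters are absent from the vocab file the run actually loads. The sketch below assumes vocab.txt holds one token per line. `load_vocab` and `missing_chars` are hypothetical helper names, not F5-TTS functions, and the sample data is made up for illustration.

```python
from collections import Counter
from typing import Iterable, Set


def load_vocab(path: str) -> Set[str]:
    """Read a vocab file with one token per line (assumed format)."""
    with open(path, encoding="utf-8") as f:
        # Strip only the newline so a token that is a single space survives.
        return {line.rstrip("\n") for line in f}


def missing_chars(texts: Iterable[str], vocab: Set[str]) -> Counter:
    """Count characters in the transcripts that are not in the vocab."""
    counts = Counter()
    for text in texts:
        for ch in text:
            if ch not in vocab:
                counts[ch] += 1
    return counts


# Made-up example; in practice use load_vocab("<path to vocab.txt>")
# and the transcript column of the training metadata.
vocab = {" ", "我", "有", "個"}
report = missing_chars(["我有3個", "我有嘢"], vocab)
print(report.most_common())
```

In this toy example the digit 3 and the character 嘢 are each reported once. If the report is empty for the real dataset but characters are still skipped, the cause is more likely in training or data quality than in vocab coverage.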

SWivid / F5-TTS

Cantonese needed! #37
