rhasspy / piper

A fast, local neural text to speech system
https://rhasspy.github.io/piper-samples/
MIT License

Important note for dataset generation! #147

Open ican24 opened 1 year ago

ican24 commented 1 year ago

Hi, "python -m piper_train.preprocess" is not able to handle double quotes ("), and maybe single quotes too. Preprocessing finishes OK, but at the training stage it stops with a "RuntimeError: CUDA out of memory. ..." error. Be careful! I lost a huge amount of time discovering and fixing this problem.

rmcpantoja commented 1 year ago

Hi @ican24, you can lower the batch size, according to your dataset size.

ican24 commented 1 year ago

Yes, sure, but that slows training down significantly: 4-8 times. It is easier to remove the quotes.

trunglebka commented 1 year ago

Could you please elaborate on the problem? Our dataset has many quotes (") and I'm not sure what the problem with them is.

ican24 commented 1 year ago

I am working around the problem with these commands:

sed -i 's/"//g' metadata.csv
sed -i 's/”//g' metadata.csv
sed -i 's/“//g' metadata.csv

The quotes are meaningless in TTS. You may need to add other commands in your case. These fixed my problem and I was able to move on.
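For anyone who would rather do the same cleanup in Python, here is a rough equivalent of the sed commands above. It is only a sketch: `strip_quotes` is a hypothetical helper name, the sample line is made up, and it removes exactly the three characters the sed commands target (straight double quote and the two curly double quotes), nothing else.

```python
# Characters removed by the three sed commands above:
# straight double quote, right curly quote, left curly quote.
QUOTES_TO_REMOVE = '"\u201d\u201c'

# Translation table that deletes every character in QUOTES_TO_REMOVE.
_DELETE_TABLE = str.maketrans("", "", QUOTES_TO_REMOVE)


def strip_quotes(text: str) -> str:
    """Return text with straight and curly double quotes removed."""
    return text.translate(_DELETE_TABLE)


# Made-up metadata line in "id|text" form, for illustration only.
sample = 'utt1|\u201cHe said "hi"\u201d'
cleaned = strip_quotes(sample)
print(cleaned)
```

To apply it to a real file, you would read metadata.csv as UTF-8, pass the contents through `strip_quotes`, and write the result back. Like `sed -i`, that overwrites the file, so keep a backup first.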

trunglebka commented 1 year ago

I mean, why do quotes cause a CUDA OOM? Is there a parsing problem in the training code but not in preprocessing?

ican24 commented 1 year ago

It is hard to say. I am not a deep learning programmer. The existing code would need to be analyzed carefully.

dttvn0010 commented 5 months ago

If your metadata CSV file has a line with only one quote, the csv reader will keep reading the following lines until it finds the next quote, so several lines get merged into one, creating a huge line and causing a memory overflow. This should be fixed by setting quotechar to None in this line: https://github.com/rhasspy/piper/blob/master/src/python/piper_train/preprocess.py#L421 i.e. reader = csv.reader(csv_file, delimiter="|", quotechar=None)
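The merging can be reproduced with nothing but Python's standard csv module. The sketch below uses made-up metadata lines (the utterance IDs and text are hypothetical, not taken from any real dataset) and a small helper, `read_rows`, that exists only for this illustration. Note that Python's csv reader opens a quoted field when the quote character is the first character of a field, so the sample puts the stray quote right after the | delimiter.

```python
import csv
import io

# Hypothetical metadata in "id|text" form. The first text field opens a
# quote that is never closed.
METADATA = (
    'utt1|"Hello there, she said.\n'
    "utt2|Second line.\n"
    "utt3|Third line.\n"
)


def read_rows(text, **reader_kwargs):
    """Parse pipe-delimited text and return the rows as a list."""
    return list(csv.reader(io.StringIO(text), delimiter="|", **reader_kwargs))


# Default dialect: '"' is the quote character, so the unclosed quote
# swallows the following lines into the first row's text field.
default_rows = read_rows(METADATA)

# Suggested fix: with quotechar=None the quote is an ordinary character.
fixed_rows = read_rows(METADATA, quotechar=None)

print(len(default_rows), "row(s) with default quoting")
print(len(fixed_rows), "row(s) with quotechar=None")
print("longest text field, default:", max(len(r[1]) for r in default_rows))
print("longest text field, fixed:  ", max(len(r[1]) for r in fixed_rows))
```

With the default settings the three lines collapse into a single row whose text field contains all of them, which is the kind of oversized utterance that could plausibly exhaust GPU memory during training. With quotechar=None each line stays its own row and the quote is kept as a literal character in the text.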