alphacep / vosk-tts

Text To Speech Synthesis with Vosk
Apache License 2.0

TypeError: generator_loss() missing 1 required positional argument: 'disc_generated_outputs' #18

Closed Serdcoff closed 7 months ago

Serdcoff commented 7 months ago

Hi there. Thank you very much for this TTS. I want to ask about training on my own data.

First, I noticed a mistake in step 0:

> Clone this repository and build monotonic align

```
git clone https://github.com/alphacep/vosk-tts
cd vosk-tts/training
cd monotonic_align
python setup.py build_ext --inplace
cd ..
```

You missed `mkdir monotonic_align`, and only then can you run `python setup.py build_ext --inplace`.

Second, when I try to run `python3 train_finetune.py`, it gives me some errors:

```
File "train_finetune.py", line 497, in <module>
    main()
File "train_finetune.py", line 58, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
File "torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
File "torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "vosk-tts/training/train_finetune.py", line 241, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d, net_dur_disc], [optim_g, optim_d, optim_dur_disc],
  File "vosk-tts/training/train_finetune.py", line 358, in train_and_evaluate
    loss_gen, losses_gen = generator_loss(y_d_hat_g)
TypeError: generator_loss() missing 1 required positional argument: 'disc_generated_outputs'
```

What did I do wrong?

My system: Ubuntu 22.04.4 LTS (release 22.04, codename jammy)

Env: Conda

Python: 3.8.18, PyTorch: 1.13.1 (+cu117)

I took requirements.txt from here: https://github.com/FENRlR/MB-iSTFT-VITS2/blob/main/requirements.txt

The voice data was recorded with the same text as in the db-finetune folder. The wavs are also 22050 Hz mono.
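For anyone reading along, here is a quick stdlib-only sketch (my own, not part of vosk-tts) for verifying that the training wavs really are 22050 Hz mono before starting a run:

```python
# Sanity check before training: verify every wav in the data folder
# matches the expected sample rate and channel count.
import wave
from pathlib import Path

def check_wavs(folder, rate=22050, channels=1):
    """Return (name, rate, channels) for each wav that does NOT match."""
    bad = []
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as w:
            if w.getframerate() != rate or w.getnchannels() != channels:
                bad.append((path.name, w.getframerate(), w.getnchannels()))
    return bad
```

An empty return list means the folder is safe to train on; anything else lists the offending files with their actual format.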

I would be glad to hear your comments.

nshmyrev commented 7 months ago

Should be fixed in https://github.com/alphacep/vosk-tts/commit/4953e108ef803dd36f5bdb49a90b3b51d4587233, please update and try again

Serdcoff commented 7 months ago

Thanks, I will try and let you know.

Serdcoff commented 7 months ago

The beginning was good, but then an error occurred at epoch 35. I tried to run it one more time; the next error came at epoch 21.

```
[E ProcessGroupGloo.cpp:137] Rank 1 successfully reached monitoredBarrier, but received errors while waiting for send/recv from rank 0. Please check rank 0 logs for faulty rank.
torch/autograd/__init__.py:197: UserWarning: Error detected in torch::autograd::AccumulateGrad. Traceback of forward call that caused the error:
  File "<string>", line 1, in <module>
  File "/miniconda3/envs/vosktts/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/miniconda3/envs/vosktts/lib/python3.8/multiprocessing/spawn.py", line 129, in _main
    return self._bootstrap(parent_sentinel)
  File "/miniconda3/envs/vosktts/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/miniconda3/envs/vosktts/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/vosk-tts/training/train_finetune.py", line 228, in run
    net_dur_disc = DDP(net_dur_disc, device_ids=[rank], find_unused_parameters=True)
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 667, in __init__
    self._ddp_init_helper(
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 733, in _ddp_init_helper
    self.reducer = dist.Reducer(
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/fx/traceback.py", line 57, in format_stack
    return traceback.format_stack()
 (Triggered internally at /opt/conda/conda-bld/pytorch_1670525541702/work/torch/csrc/autograd/python_anomaly_mode.cpp:114.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "train_finetune.py", line 502, in <module>
    main()
  File "train_finetune.py", line 60, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1120, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/miniconda3/envs/vosktts/lib/python3.8/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/miniconda3/envs/vosktts/lib/python3.8/threading.py", line 306, in wait
    gotit = waiter.acquire(True, timeout)
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 67610) is killed by signal: Segmentation fault.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "vosk-tts/training/train_finetune.py", line 242, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d, net_dur_disc], [optim_g, optim_d, optim_dur_disc],
  File "vosk-tts/training/train_finetune.py", line 283, in train_and_evaluate
    for batch_idx, (x, x_lengths, spec, spec_lengths, y, y_lengths, speakers) in enumerate(loader):
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1316, in _next_data
    idx, data = self._get_data()
  File "/miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1272, in _get_data
    success, data = self._try_get_data()
  File "miniconda3/envs/vosktts/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1133, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 67610) exited unexpectedly

miniconda3/envs/vosktts/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 32 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```

nshmyrev commented 7 months ago

Well, you can probably already try existing checkpoint after epoch 35.

As for the segmentation fault in the data loader, it might be due to insufficient memory. You can probably reduce the number of data loader workers here:

https://github.com/alphacep/vosk-tts/blob/master/training/train_finetune.py#L98

change 8 to 4
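As a side note, the advice above boils down to editing the hardcoded worker count in train_finetune.py. A hedged, stdlib-only sketch of picking a conservative value (a heuristic of mine, not the project's code) could look like this:

```python
# Sketch (not vosk-tts code): clamp the DataLoader worker count.
# Fewer workers means fewer copies of the data pipeline in memory,
# which helps when workers die with segfaults or OOM kills.
import os

def safe_num_workers(requested=8, cap=4):
    """Clamp the requested worker count to a small cap and the CPU count."""
    cpus = os.cpu_count() or 1
    return max(0, min(requested, cap, cpus))
```

The returned value would then be passed as `num_workers` when the training script constructs its `DataLoader`.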

Serdcoff commented 7 months ago

So, I changed the number from 8 to 4. The error then appeared at epoch 39. By the way, I have two RTX 3090s with 24 GB each, 48 GB in total.

As you said, the next step is: `python3 export_onnx.py`

But where is the export_onnx.py file? Is it onnx_export.py?

If yes, here is the error:

```
Using mel posterior encoder for VITS2
256 2
Multi-band iSTFT VITS2
Traceback (most recent call last):
  File "onnx_export.py", line 55, in <module>
    = utils.load_checkpoint(PATH_TO_MODEL, net_g, None)
  File "/vosk-tts/training/utils.py", line 19, in load_checkpoint
    assert os.path.isfile(checkpoint_path)
AssertionError
```

Serdcoff commented 7 months ago

Hi there one more time. I tried to do it all from scratch and was lucky to get to epoch 100. Twice I got the same error after epoch 100:

```
Traceback (most recent call last):
  File "train_finetune.py", line 502, in <module>
    main()
  File "train_finetune.py", line 60, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "venv/vosk-tts38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "venv/vosk-tts38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "venv/vosk-tts38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "venv/vosk-tts38/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/vosk-tts/training/train_finetune.py", line 253, in run
    evaluate(hps, net_g, eval_loader, writer_eval)
UnboundLocalError: local variable 'eval_loader' referenced before assignment
```

Could you please check?

P.S. Also, here https://github.com/alphacep/vosk-tts/blob/4953e108ef803dd36f5bdb49a90b3b51d4587233/training/onnx_export.py#L30 the code refers to G_4300.pth, but after training the highest checkpoint I have is G_750.pth.
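A small sketch for sidestepping the hardcoded checkpoint name: pick the newest `G_*.pth` in the log directory instead. The file-naming pattern is taken from this thread; I have not verified it against onnx_export.py itself:

```python
# Sketch: instead of a hardcoded G_4300.pth, find the checkpoint with
# the highest step number (naming convention assumed from the thread).
import re
from pathlib import Path

def latest_checkpoint(log_dir, prefix="G_"):
    """Return the Path of the newest <prefix><step>.pth, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)\.pth$")
    best, best_step = None, -1
    for path in Path(log_dir).glob(f"{prefix}*.pth"):
        m = pattern.match(path.name)
        if m and int(m.group(1)) > best_step:
            best, best_step = path, int(m.group(1))
    return best
```

With this, the export script could load `latest_checkpoint(log_dir)` regardless of how many epochs the run survived.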

Now I'm trying to train with your dataset to check, because with my training result there is no voice, only robotic noise.

UPDATE: The same thing happened when training on your dataset. I will attach a wav file with the result.

out.zip

Serdcoff commented 6 months ago

> Well, you can probably already try existing checkpoint after epoch 35.
>
> As for segmentation fault in data loader, it might be due to insufficient memory. You can probably reduce number of data loaders here:
>
> https://github.com/alphacep/vosk-tts/blob/master/training/train_finetune.py#L98
>
> change 8 to 4

Hi there. Are there any updates on my last post?

nshmyrev commented 6 months ago

You can simply export the last checkpoint now and try; it will work.

Serdcoff commented 6 months ago

Hi again. Thank you one more time for all your amazing work. That was my fault; I was inattentive: I did not train from the pre-trained model's checkpoint. Now the training works. I created a pretrained folder inside the training folder and copied your pre-trained model there. I hope this will change the result; I will attach it later. Update: the result was amazing. Everything works perfectly. Thank you one more time.

So, another question: what if I train your pre-trained model on another language?

Serdcoff commented 6 months ago

> You can simply export the last checkpoint now and try, it will work

You were right, it works PERFECT! So, another question: what if I train your pre-trained model on another language?

nshmyrev commented 6 months ago

It should work for other languages, but you need to train the model and we don't have very detailed docs yet. You can also look at Piper.

Serdcoff commented 6 months ago

> It should work for other languages, but you need to train the model and we don't have very detailed docs yet. You can also look at Piper.

I will read about Piper soon; I hope I will find the desired language. Thank you. Does the model have to be pre-trained on an English dataset, or can I fine-tune yours from the checkpoint with my English data? Also, how can I name the voice I trained when adding it? As I understand it, I can train more than one voice at a time (it depends on the metadata.csv file). Can a multilingual model mix two languages in one sentence?
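On the multi-voice question: assuming a VITS-style metadata.csv where each line is `path|speaker_id|text` (this column layout is an assumption; check the db-finetune metadata for the real one), listing the voices in a file is straightforward:

```python
# Sketch: list the distinct speaker ids in a VITS-style metadata file
# where each line is "path|speaker_id|text" (layout assumed, not verified
# against vosk-tts itself).
def speakers_in_metadata(lines):
    """Return the sorted set of speaker ids found in metadata lines."""
    ids = set()
    for line in lines:
        parts = line.strip().split("|")
        if len(parts) >= 3:
            ids.add(parts[1])
    return sorted(ids)
```

More than one id in the result would confirm that a single metadata.csv carries several voices at once.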

nshmyrev commented 6 months ago

Depends on the language you are interested in

Serdcoff commented 6 months ago

> Depends on the language you are interested in

So, first of all I want the TTS to mix two languages, EN and RU, when a sentence contains English words such as a company name or trademark. Also, how can I solve the problem with numbers? Do I need to implement something like https://github.com/savoirfairelinux/num2words?
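On numbers: num2words does exactly this kind of conversion per language. A minimal sketch of the surrounding normalization step, with a toy digit-by-digit converter standing in for num2words so the snippet stays dependency-free:

```python
# Sketch of TTS text normalization: replace digit runs with words before
# synthesis. to_words() below is a toy stand-in; real code would call
# num2words from the num2words package with the target language.
import re

ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]

def to_words(n):
    """Toy digit-by-digit converter; a stand-in for num2words."""
    return " ".join(ONES[int(d)] for d in str(n))

def normalize_numbers(text):
    """Replace every run of digits in text with its spoken form."""
    return re.sub(r"\d+", lambda m: to_words(int(m.group())), text)
```

The same substitution hook is where language-aware expansion (ordinals, currency, Russian case agreement) would plug in.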

Serdcoff commented 6 months ago

Hello, dear Nikolay. I have a question about your model's dictionary. Its format is close to the CMU dict. If I use the espeak-ng phoneme dictionary format in this recipe, will it work or not? If not, how can I convert my dictionary txt file with words to your format if the language is neither English nor Russian?
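Since neither dictionary format is spelled out in this thread, only a hedged shape of a converter can be given. Assuming the source lexicon is tab-separated (`word<TAB>ph1 ph2 ...`) and the target is CMU-style one-entry-per-line with space-separated phones, a conversion could look like:

```python
# Sketch only: both dictionary formats here are assumptions, not the
# actual vosk-tts or espeak-ng formats. Converts "word<TAB>phones" lines
# into CMU-style "word PH1 PH2 ..." entries, skipping malformed lines.
def convert_lexicon(lines):
    out = []
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue  # skip blanks and lines without a word/phones split
        word, phones = line.split("\t", 1)
        out.append(f"{word.lower()} {' '.join(phones.split())}")
    return out
```

The real work for a new language is not the file reshaping but mapping the phone inventory itself, which only the model's training recipe can settle.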