NTT123 / light-speed

A modified VITS that utilizes phoneme duration's ground truth for better robustness
MIT License

About training #9

Closed realHoangHai closed 7 months ago

realHoangHai commented 7 months ago

Hello!

I am using your public dataset https://huggingface.co/datasets/ntt123/viet-tts-dataset for training, and I got this error:

2024-04-05 14:36:50:     return forward_call(*args, **kwargs)
2024-04-05 14:36:50:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-05 14:36:50:   File "/data/light-speed/models.py", line 425, in forward
2024-04-05 14:36:50:     z_slice, ids_slice = commons.rand_slice_segments(
2024-04-05 14:36:51:                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-05 14:36:51:   File "/data/light-speed/commons.py", line 64, in rand_slice_segments
2024-04-05 14:36:51:     ret = slice_segments(x, ids_str, segment_size)
2024-04-05 14:36:51:           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-05 14:36:51:   File "/data/light-speed/commons.py", line 54, in slice_segments
2024-04-05 14:36:51:     ret[i] = x[i, :, idx_str:idx_end]
2024-04-05 14:36:51:     ~~~^^^
2024-04-05 14:36:51: RuntimeError: The expanded size of the tensor (32) must match the existing size (0) at non-singleton dimension 1.  Target sizes: [192, 32].  Tensor sizes: [192, 0]
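
A note on reading the traceback: the slice `x[i, :, idx_str:idx_end]` came back with 0 time frames where 32 were expected. One possible cause, assuming `commons.py` follows the upstream VITS helper (start index drawn from `length - segment_size + 1`), is an utterance shorter than the segment size: the start index then goes negative and the slice is empty. The sketch below imitates that with a plain Python list. The names `slice_segment` and `find_short_items` are hypothetical and are not part of this repository.

```python
def slice_segment(frames, idx_str, segment_size):
    """Imitate x[i, :, idx_str:idx_end] along the time axis using a list."""
    idx_end = idx_str + segment_size
    return frames[idx_str:idx_end]


def find_short_items(lengths, segment_size=32):
    """Return indices of items that have fewer frames than segment_size."""
    return [i for i, n in enumerate(lengths) if n < segment_size]


# Assumption: the batch is padded to 100 frames, but this item has only 20.
padded = list(range(100))
segment_size = 32
valid_length = 20

# Assumed upstream-VITS-style bound for the random start index.
ids_str_max = valid_length - segment_size + 1  # negative when the item is short

# Stand-in for rand() * ids_str_max cast to an integer (truncates toward zero).
idx_str = int(0.5 * ids_str_max)

segment = slice_segment(padded, idx_str, segment_size)
print(len(segment))

# Hypothetical frame counts; flags the items too short to slice.
print(find_short_items([100, 20, 32]))
```

If this is the cause, the error would appear only on some batches, because a random draw close to zero still yields a start index of 0. Checking the frame counts of the dataset against the configured segment size would confirm or rule it out.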

Have you encountered this error before? Is there a solution? Could the issue be related to how the dataset files are named?

The directory structure of my data is as follows:

Screenshot 2024-04-05 174334

Looking forward to your reply!