Closed LanglyAdrian closed 1 year ago
The error was that I tried to downsample the wavs via librosa.load('test.wav', sr=22050), which resulted in the bit depth changing from 16 to 32. Using a correct downsampling method solved the problem.
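For anyone who wants to check whether their files were affected, here is a minimal sketch using only Python's standard-library `wave` module. The helper name `describe_wav` is made up for illustration. It relies on the fact that `wave` only reads integer PCM data and raises `wave.Error` for other encodings such as 32-bit float, so a `None` result is a hint (not proof) that the file is not plain 16-bit PCM.

```python
import wave


def describe_wav(path):
    """Return (sample_rate, bits_per_sample) for an integer-PCM WAV file.

    Returns None when the standard-library reader rejects the file,
    which happens for float-encoded WAVs among other cases.
    """
    try:
        with wave.open(path, "rb") as wf:
            # getsampwidth() is in bytes, so multiply by 8 to get bits.
            return wf.getframerate(), wf.getsampwidth() * 8
    except wave.Error:
        return None
```

A file that is ready for training at 22050 Hz would be expected to report `(22050, 16)`; anything else suggests the downsampling step changed more than the sample rate.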
Can you be more specific? I also encountered the same problem.
@LanglyAdrian Can you tell me how you fixed the problem?
ffmpeg or sox
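A rough sketch of what the ffmpeg route could look like when driven from Python. The function names and file paths are made up for illustration, and the choice of `pcm_s16le` (16-bit PCM output) is an assumption based on the bit-depth issue described above, not something the reply spelled out. It needs `ffmpeg` on the PATH.

```python
import subprocess


def ffmpeg_resample_cmd(src, dst, sample_rate=22050):
    """Build an ffmpeg command that resamples src and writes 16-bit PCM to dst."""
    return [
        "ffmpeg",
        "-i", src,                # input file
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", "pcm_s16le",      # keep the output at 16-bit PCM
        dst,
    ]


def resample(src, dst, sample_rate=22050):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_resample_cmd(src, dst, sample_rate), check=True)


# A comparable sox invocation would be:
#   sox in.wav -r 22050 -b 16 out.wav
```

Usage would be something like `resample("in.wav", "out_22k.wav")` for each file in the dataset.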
@LanglyAdrian Thank you. You are correct, the problem is the downsampling step. I had downsampled with torchaudio, which normalizes by default, so the wav files ended up normalized twice. I then used the pydub library to downsample instead, and it worked fine.
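To illustrate why double normalization can end in silence, here is a small arithmetic sketch. It assumes the training data loader divides every sample by 32768 (the usual full-scale value for 16-bit audio); the constant and function names are illustrative, not taken from the repository.

```python
MAX_WAV_VALUE = 32768.0  # assumed full-scale divisor for 16-bit audio


def loader_normalize(sample):
    """Mimic a loader that scales raw sample values into [-1, 1]."""
    return sample / MAX_WAV_VALUE


# Case 1: the file holds 16-bit integers, as the loader expects.
int16_sample = 16384                      # half of full scale
print(loader_normalize(int16_sample))     # 0.5

# Case 2: the file was already normalized to floats in [-1, 1]
# during downsampling, so the loader normalizes it a second time.
float_sample = 0.5
print(loader_normalize(float_sample))     # 1.52587890625e-05
```

In the second case the signal is about 32768 times quieter than intended, which would be effectively silent to the model.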
I downloaded the VCTK-corpus dataset, downsampled it to 22050 Hz, and started training with the default parameters (I only changed the batch size to 32). 130k steps have already passed and I wanted to hear the result, but when generating audio (G_1300000.pth) I get silence. Is this normal? Has anyone else experienced this?