Closed ductho9799 closed 1 year ago
I have implemented duration predictor training code. You can test it.
Hi, I will check and review the code ASAP.
I trained VITS2 with your code on my private data.
How are the results? Can you share some samples? No need to share the weights; just WAV samples, if possible, so I can judge the output quality. Thanks!
Thanks for the samples. They do sound good. Can I ask if you transferred VITS-1 weights to VITS-2 or trained VITS-2 from scratch?
I trained VITS-2 from scratch. Here is my config: vits-2-configs.json. I trained it on 4× RTX 3090 (24 GB VRAM each).
Interesting! Can I add your samples to the README of this repo? I would still advise adding the discriminator and training the model. Also, it would be great if you could turn on the other flags and check for any improvement in the output. Thanks!
Thanks for your suggestions. I'm planning to train VITS-2 with the LJSpeech dataset next week. I will send you the checkpoint of LJSpeech and generated samples.
Hi, I updated the code with 2 discriminators; please check it if you are interested.
Thank you so much for adding the new discriminators. I will test and train with them and share the results with you as soon as possible.
@ductho9799 hello. What improved in the speech? I'm curious whether it was just pronunciation or other characteristics of the voice as well.
@p0p4k @egorsmkv Hello, I trained a version of VITS-2 on the LJSpeech dataset. I have shared the weights, config, and audio samples in VITS-2. Can you help me evaluate the quality of VITS-2 on the LJSpeech dataset?
I trained VITS-2 for 390 epochs and the duration predictor for 200 epochs.
@ductho9799 please change the access settings on your Drive file. Thanks.
Yes, please try it again.
Thanks for sharing the checkpoints! The samples don't sound bad! Can you train the latest code with the duration discriminator and the HiFi-GAN discriminator (multi-period disc) with nosdp (no stochastic duration predictor)?
I am booting a cloud GPU right now to train as well. I want to check whether the duration discriminator is working or not (no NaN/Inf values, etc.).
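A minimal sketch of the kind of NaN/Inf guard I mean (my own illustration, not code from this repo), using `torch.isfinite` to fail fast if a loss goes bad mid-training:

```python
import torch

def assert_finite(name: str, t: torch.Tensor) -> None:
    # Fail fast if a loss or activation picked up NaN/Inf values during training.
    if not torch.isfinite(t).all():
        raise RuntimeError(f"{name} contains NaN/Inf values")

# e.g. guard a (hypothetical) duration-discriminator loss before backward():
loss = torch.tensor(0.37)
assert_finite("dur_disc_loss", loss)  # passes: the value is finite
```

Calling this on each discriminator loss every few steps makes a diverging run stop immediately instead of silently training on garbage.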
> Can you train the latest code with the duration discriminator and the HiFi-GAN discriminator (multi-period disc) with nosdp?

I can train this config over the weekend.
If the training works well, I will share the checkpoints so you can continue training from them; otherwise I will try to fix the code before the weekend.
@ductho9799 the checkpoints are on the main-page README. Good luck!
> I trained VITS-2 from scratch. Here is my config: vits-2-configs.json. I trained it on 4× RTX 3090 (24 GB VRAM each).
@ductho9799 Can you share the file symbols.py? I trained on the InfoRe dataset but the results are not good. I used the same config as you. :(( All configs, the model, and train.log are in the Drive folder. Can you give me some advice? Thank you very much.
@ductho9799 have you tried with an external embedding extractor?
> @ductho9799 have you tried with an external embedding extractor?
Did you mean Bert-VITS2?
@ductho9799 Can you please share your symbols? While trying to run inference, I get this error:

```
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
    size mismatch for enc_p.emb.weight: copying a param with shape torch.Size([184, 192]) from checkpoint, the shape in current model is torch.Size([178, 192]).
```
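For what it's worth, that size mismatch (184 vs 178) means the checkpoint was trained with a larger symbol table than your local symbols.py defines: `n_vocab = len(symbols)` sets the first dimension of `enc_p.emb.weight`. A quick way to read the vocab size straight out of the checkpoint (a sketch assuming the usual VITS-style layout, where weights nest under a `"model"` key):

```python
import torch

def checkpoint_vocab_size(ckpt_path: str) -> int:
    # Load on CPU and look up the text-encoder embedding; its first
    # dimension is the number of symbols the model was trained with.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state = ckpt.get("model", ckpt)  # fall back to a flat state_dict
    return state["enc_p.emb.weight"].shape[0]
```

If this returns 184 while `len(symbols)` is 178 locally, the fix is to use the symbols.py the checkpoint was trained with, not to resize the embedding.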