Closed MMMMichaelzhang closed 2 years ago
You can just change the number of domains and fine-tune the model. It will add one more projection to the style encoder and mapping network while keeping the original ones, and the same goes for the discriminator. The generator is independent of the number of domains.
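To illustrate the idea of adding one projection while keeping the original ones, here is a toy, framework-free sketch. It is not the repository's code: `extend_heads`, the list-of-lists representation, and the Gaussian initialisation are all hypothetical stand-ins for the per-domain projection layers described above.

```python
import random


def extend_heads(old_heads, num_domains, dim=4, seed=0):
    """Return per-domain heads for `num_domains`, reusing the trained ones.

    `old_heads` is a list with one weight vector per existing domain.
    This is a hypothetical stand-in for per-domain projection layers.
    """
    if num_domains < len(old_heads):
        raise ValueError("this sketch only handles growing the domain count")
    rng = random.Random(seed)
    heads = list(old_heads)  # trained heads keep their index and weights
    while len(heads) < num_domains:
        # Assumed initialisation; a real model would use its own scheme.
        heads.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return heads


# 20 trained speakers (domains 0..19); the new speaker becomes domain 20.
trained = [[float(i)] * 4 for i in range(20)]
extended = extend_heads(trained, num_domains=21)

print(len(extended))             # 21
print(extended[:20] == trained)  # True: the original heads are untouched
```

The point of the sketch is only that existing domain indices stay stable, so the new speaker should take the next unused index rather than replace an existing one.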
When I fine-tuned the model, I added 1 speaker and then set num_domains=1, and I got an error. RuntimeError: output with shape [1, 512, 1, 1] doesn't match the broadcast shape [32, 512, 1, @yl4579
It has nothing to do with the added speaker. It just happens that the size of your dataset modulo the batch size is 1, so the last batch contains a single sample. See #42.
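The remainder check described above can be sketched in a few lines. The helper names and the dataset size of 321 are made up for illustration; only the arithmetic (dataset size modulo batch size) comes from the reply.

```python
def last_batch_size(num_samples, batch_size):
    """Size of the final batch when iterating without dropping any samples."""
    if num_samples == 0:
        return 0
    remainder = num_samples % batch_size
    # A remainder of 0 means the last batch is a full one.
    return remainder if remainder else batch_size


def trim_to_full_batches(samples, batch_size):
    """Drop trailing samples so the list length is a multiple of batch_size."""
    usable = len(samples) - len(samples) % batch_size
    return samples[:usable]


samples = list(range(321))  # hypothetical dataset: 10 * 32 + 1 samples

print(last_batch_size(len(samples), 32))       # 1, the problematic case
print(len(trim_to_full_batches(samples, 32)))  # 320
```

If the data is loaded through PyTorch's `torch.utils.data.DataLoader`, passing `drop_last=True` has the same effect as trimming by hand. Whether this project's data loader exposes that option is not stated in the thread, so treat it as something to check.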
What should I do to solve this problem? @yl4579
I made my dataset size an exact multiple of the batch size and set num_domains=1, but I still got nan. @yl4579
train/real : 0.4016
train/fake : 0.3439
train/reg : 0.0005
train/real_adv_cls: nan
train/con_reg : 0.0278
train/adv : 1.5941
train/sty : 0.2312
train/ds : 0.0003
train/cyc : 1.0075
train/norm : 2.6099
train/asr : 0.0490
train/f0 : 0.2077
train/adv_cls : nan
eval/real : 0.6233
eval/fake : 0.6014
eval/reg : 0.0000
eval/real_adv_cls: nan
eval/con_reg : 0.0000
eval/adv : 0.7952
eval/sty : 0.1543
eval/ds : 0.0001
eval/cyc : 0.9594
eval/norm : 1.9790
eval/asr : 0.0442
eval/f0 : 0.3179
eval/adv_cls : nan
When I fine-tuned the model, I had 20 speakers and the checkpoint was epoch_00300.pth. Now I want to add 1 speaker. How should I set this up? I changed pretrained_model in config.yml; should I then set num_domains=21? Can you tell me how to do it? Thanks.