Issue with Second-Stage Inference: Config File and Checkpoint Size Mismatch

Hello,

Thank you for your excellent work on the lip2speech-unit project. I am currently trying to perform inference using the instructions provided. However, I encountered a problem related to the configuration file:

In your code, specifically this section:

    if os.path.isdir(a.checkpoint_file):
        config_file = os.path.join(a.checkpoint_file, 'config.json')
    else:
        config_file = os.path.join(os.path.split(a.checkpoint_file)[0], 'config2.json')

It seems the required config.json or config2.json file is not provided in the repository. To proceed, I downloaded a configuration file from the HiFi-GAN repository. However, when I run inference with it, loading the checkpoint fails with multiple size mismatch errors, particularly for layers such as resblocks, conv_post, and dict.weight. Here is an example of the errors:

    size mismatch for resblocks.10.convs1.1.weight_g: copying a param with shape torch.Size([32, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 1, 1]).
    size mismatch for resblocks.10.convs1.1.weight_v: copying a param with shape torch.Size([32, 32, 7]) from checkpoint, the shape in current model is torch.Size([8, 8, 7]).
    ...

The checkpoint tensors are consistently larger than those of the model built from the downloaded config, so I suspect that config describes a smaller generator than the one the checkpoint was trained with.

Could you provide the correct configuration files, or guidance on how to modify the model or the configuration so that the shapes match? Any help or detailed instructions would be greatly appreciated.
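In case it helps with narrowing this down, below is a minimal sketch of how the mismatched parameters could be listed in one pass instead of being read out of the traceback. The helper name `diff_shapes` and the sample dictionaries are hypothetical and not part of lip2speech-unit or HiFi-GAN; the only shapes used are the two from the error above. With PyTorch, the two name-to-shape mappings would presumably be built from the loaded checkpoint and from the model's `state_dict()`, but that part is an assumption and is left out here.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_shapes(checkpoint_shapes: Dict[str, Shape],
                model_shapes: Dict[str, Shape]) -> List[str]:
    """Return one report line per parameter whose shape differs or is missing."""
    report = []
    # Walk the union of names so parameters present on only one side show up too.
    for name in sorted(set(checkpoint_shapes) | set(model_shapes)):
        ckpt = checkpoint_shapes.get(name)
        model = model_shapes.get(name)
        if ckpt is None:
            report.append(f"{name}: missing from checkpoint, model has {model}")
        elif model is None:
            report.append(f"{name}: missing from model, checkpoint has {ckpt}")
        elif ckpt != model:
            report.append(f"{name}: checkpoint {ckpt} vs model {model}")
    return report


if __name__ == "__main__":
    # Hypothetical sample data: only the two shapes quoted in the error message.
    checkpoint = {
        "resblocks.10.convs1.1.weight_g": (32, 1, 1),
        "resblocks.10.convs1.1.weight_v": (32, 32, 7),
    }
    model = {
        "resblocks.10.convs1.1.weight_g": (8, 1, 1),
        "resblocks.10.convs1.1.weight_v": (8, 8, 7),
    }
    for line in diff_shapes(checkpoint, model):
        print(line)
```

A report like this would make it easier to see whether every layer is off by the same factor, which would point at a single channel-width setting in the config rather than at a different architecture.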

Thank you in advance!
