b04901014 / MQTTS

MIT License

Error running inference from pre-trained models: inp = torch.cat([spkr, inp], 1) RuntimeError: Tensors must have same number of dimensions: got 4 and 3 #3

Open swatsw opened 1 year ago

swatsw commented 1 year ago

Full output from running batch inference with pre-trained models:

_IncompatibleKeys(missing_keys=[], unexpected_keys=['vocoder.quantizer.quantizer_modules.0.embedding.weight', 'vocoder.quantizer.quantizer_modules.1.embedding.weight', 'vocoder.quantizer.quantizer_modules.2.embedding.weight', 'vocoder.quantizer.quantizer_modules.3.embedding.weight', 'vocoder.generator.conv_pre.bias', 'vocoder.generator.conv_pre.weight', 'vocoder.generator.ups.0.bias', 'vocoder.generator.ups.0.weight', 'vocoder.generator.ups.1.bias', 'vocoder.generator.ups.1.weight', 'vocoder.generator.ups.2.bias', 'vocoder.generator.ups.2.weight', 'vocoder.generator.ups.3.bias', 'vocoder.generator.ups.3.weight', 'vocoder.generator.resblocks.0.convs1.0.bias', 'vocoder.generator.resblocks.0.convs1.0.weight', 'vocoder.generator.resblocks.0.convs1.1.bias', 'vocoder.generator.resblocks.0.convs1.1.weight', 'vocoder.generator.resblocks.0.convs1.2.bias', 'vocoder.generator.resblocks.0.convs1.2.weight', 'vocoder.generator.resblocks.0.convs2.0.bias', 'vocoder.generator.resblocks.0.convs2.0.weight', 'vocoder.generator.resblocks.0.convs2.1.bias', 'vocoder.generator.resblocks.0.convs2.1.weight', 'vocoder.generator.resblocks.0.convs2.2.bias', 'vocoder.generator.resblocks.0.convs2.2.weight', 'vocoder.generator.resblocks.1.convs1.0.bias', 'vocoder.generator.resblocks.1.convs1.0.weight', 'vocoder.generator.resblocks.1.convs1.1.bias', 'vocoder.generator.resblocks.1.convs1.1.weight', 'vocoder.generator.resblocks.1.convs1.2.bias', 'vocoder.generator.resblocks.1.convs1.2.weight', 'vocoder.generator.resblocks.1.convs2.0.bias', 'vocoder.generator.resblocks.1.convs2.0.weight', 'vocoder.generator.resblocks.1.convs2.1.bias', 'vocoder.generator.resblocks.1.convs2.1.weight', 'vocoder.generator.resblocks.1.convs2.2.bias', 'vocoder.generator.resblocks.1.convs2.2.weight', 'vocoder.generator.resblocks.2.convs1.0.bias', 'vocoder.generator.resblocks.2.convs1.0.weight', 'vocoder.generator.resblocks.2.convs1.1.bias', 'vocoder.generator.resblocks.2.convs1.1.weight', 
'vocoder.generator.resblocks.2.convs1.2.bias', 'vocoder.generator.resblocks.2.convs1.2.weight', 'vocoder.generator.resblocks.2.convs2.0.bias', 'vocoder.generator.resblocks.2.convs2.0.weight', 'vocoder.generator.resblocks.2.convs2.1.bias', 'vocoder.generator.resblocks.2.convs2.1.weight', 'vocoder.generator.resblocks.2.convs2.2.bias', 'vocoder.generator.resblocks.2.convs2.2.weight', 'vocoder.generator.resblocks.3.convs1.0.bias', 'vocoder.generator.resblocks.3.convs1.0.weight', 'vocoder.generator.resblocks.3.convs1.1.bias', 'vocoder.generator.resblocks.3.convs1.1.weight', 'vocoder.generator.resblocks.3.convs1.2.bias', 'vocoder.generator.resblocks.3.convs1.2.weight', 'vocoder.generator.resblocks.3.convs2.0.bias', 'vocoder.generator.resblocks.3.convs2.0.weight', 'vocoder.generator.resblocks.3.convs2.1.bias', 'vocoder.generator.resblocks.3.convs2.1.weight', 'vocoder.generator.resblocks.3.convs2.2.bias', 'vocoder.generator.resblocks.3.convs2.2.weight', 'vocoder.generator.resblocks.4.convs1.0.bias', 'vocoder.generator.resblocks.4.convs1.0.weight', 'vocoder.generator.resblocks.4.convs1.1.bias', 'vocoder.generator.resblocks.4.convs1.1.weight', 'vocoder.generator.resblocks.4.convs1.2.bias', 'vocoder.generator.resblocks.4.convs1.2.weight', 'vocoder.generator.resblocks.4.convs2.0.bias', 'vocoder.generator.resblocks.4.convs2.0.weight', 'vocoder.generator.resblocks.4.convs2.1.bias', 'vocoder.generator.resblocks.4.convs2.1.weight', 'vocoder.generator.resblocks.4.convs2.2.bias', 'vocoder.generator.resblocks.4.convs2.2.weight', 'vocoder.generator.resblocks.5.convs1.0.bias', 'vocoder.generator.resblocks.5.convs1.0.weight', 'vocoder.generator.resblocks.5.convs1.1.bias', 'vocoder.generator.resblocks.5.convs1.1.weight', 'vocoder.generator.resblocks.5.convs1.2.bias', 'vocoder.generator.resblocks.5.convs1.2.weight', 'vocoder.generator.resblocks.5.convs2.0.bias', 'vocoder.generator.resblocks.5.convs2.0.weight', 'vocoder.generator.resblocks.5.convs2.1.bias', 
'vocoder.generator.resblocks.5.convs2.1.weight', 'vocoder.generator.resblocks.5.convs2.2.bias', 'vocoder.generator.resblocks.5.convs2.2.weight', 'vocoder.generator.resblocks.6.convs1.0.bias', 'vocoder.generator.resblocks.6.convs1.0.weight', 'vocoder.generator.resblocks.6.convs1.1.bias', 'vocoder.generator.resblocks.6.convs1.1.weight', 'vocoder.generator.resblocks.6.convs1.2.bias', 'vocoder.generator.resblocks.6.convs1.2.weight', 'vocoder.generator.resblocks.6.convs2.0.bias', 'vocoder.generator.resblocks.6.convs2.0.weight', 'vocoder.generator.resblocks.6.convs2.1.bias', 'vocoder.generator.resblocks.6.convs2.1.weight', 'vocoder.generator.resblocks.6.convs2.2.bias', 'vocoder.generator.resblocks.6.convs2.2.weight', 'vocoder.generator.resblocks.7.convs1.0.bias', 'vocoder.generator.resblocks.7.convs1.0.weight', 'vocoder.generator.resblocks.7.convs1.1.bias', 'vocoder.generator.resblocks.7.convs1.1.weight', 'vocoder.generator.resblocks.7.convs1.2.bias', 'vocoder.generator.resblocks.7.convs1.2.weight', 'vocoder.generator.resblocks.7.convs2.0.bias', 'vocoder.generator.resblocks.7.convs2.0.weight', 'vocoder.generator.resblocks.7.convs2.1.bias', 'vocoder.generator.resblocks.7.convs2.1.weight', 'vocoder.generator.resblocks.7.convs2.2.bias', 'vocoder.generator.resblocks.7.convs2.2.weight', 'vocoder.generator.resblocks.8.convs1.0.bias', 'vocoder.generator.resblocks.8.convs1.0.weight', 'vocoder.generator.resblocks.8.convs1.1.bias', 'vocoder.generator.resblocks.8.convs1.1.weight', 'vocoder.generator.resblocks.8.convs1.2.bias', 'vocoder.generator.resblocks.8.convs1.2.weight', 'vocoder.generator.resblocks.8.convs2.0.bias', 'vocoder.generator.resblocks.8.convs2.0.weight', 'vocoder.generator.resblocks.8.convs2.1.bias', 'vocoder.generator.resblocks.8.convs2.1.weight', 'vocoder.generator.resblocks.8.convs2.2.bias', 'vocoder.generator.resblocks.8.convs2.2.weight', 'vocoder.generator.resblocks.9.convs1.0.bias', 'vocoder.generator.resblocks.9.convs1.0.weight', 
'vocoder.generator.resblocks.9.convs1.1.bias', 'vocoder.generator.resblocks.9.convs1.1.weight', 'vocoder.generator.resblocks.9.convs1.2.bias', 'vocoder.generator.resblocks.9.convs1.2.weight', 'vocoder.generator.resblocks.9.convs2.0.bias', 'vocoder.generator.resblocks.9.convs2.0.weight', 'vocoder.generator.resblocks.9.convs2.1.bias', 'vocoder.generator.resblocks.9.convs2.1.weight', 'vocoder.generator.resblocks.9.convs2.2.bias', 'vocoder.generator.resblocks.9.convs2.2.weight', 'vocoder.generator.resblocks.10.convs1.0.bias', 'vocoder.generator.resblocks.10.convs1.0.weight', 'vocoder.generator.resblocks.10.convs1.1.bias', 'vocoder.generator.resblocks.10.convs1.1.weight', 'vocoder.generator.resblocks.10.convs1.2.bias', 'vocoder.generator.resblocks.10.convs1.2.weight', 'vocoder.generator.resblocks.10.convs2.0.bias', 'vocoder.generator.resblocks.10.convs2.0.weight', 'vocoder.generator.resblocks.10.convs2.1.bias', 'vocoder.generator.resblocks.10.convs2.1.weight', 'vocoder.generator.resblocks.10.convs2.2.bias', 'vocoder.generator.resblocks.10.convs2.2.weight', 'vocoder.generator.resblocks.11.convs1.0.bias', 'vocoder.generator.resblocks.11.convs1.0.weight', 'vocoder.generator.resblocks.11.convs1.1.bias', 'vocoder.generator.resblocks.11.convs1.1.weight', 'vocoder.generator.resblocks.11.convs1.2.bias', 'vocoder.generator.resblocks.11.convs1.2.weight', 'vocoder.generator.resblocks.11.convs2.0.bias', 'vocoder.generator.resblocks.11.convs2.0.weight', 'vocoder.generator.resblocks.11.convs2.1.bias', 'vocoder.generator.resblocks.11.convs2.1.weight', 'vocoder.generator.resblocks.11.convs2.2.bias', 'vocoder.generator.resblocks.11.convs2.2.weight', 'vocoder.generator.conv_post.bias', 'vocoder.generator.conv_post.weight', 'vocoder.generator.spkr_linear.0.weight', 'vocoder.generator.spkr_linear.0.bias', 'vocoder.generator.spkr_linear.2.weight', 'vocoder.generator.spkr_linear.2.bias']) Removing weight norm... Removing weight norm... Inferencing batch 1, total 41 baches. 
Traceback (most recent call last):
  File "/home/miniconda3/envs/mqtts/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/miniconda3/envs/mqtts/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "infer.py", line 81, in <module>
    synthetic = model(i_wavs, i_phones)
  File "/home/miniconda3/envs/mqtts/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/MQTTS/tester.py", line 71, in forward
    synthetic = self.TTSdecoder.inference_topkp_sampling_batch(phone_features, speaker_embedding, phone_masks, prior=prior)
  File "/home/MQTTS/modules/wildttstransformer.py", line 88, in inference_topkp_sampling_batch
    inp = torch.cat([spkr, inp], 1)
RuntimeError: Tensors must have same number of dimensions: got 4 and 3
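
The error can be reproduced in isolation, because torch.cat requires every input to have the same number of dimensions. The sketch below uses made-up shapes (the real shapes of spkr and inp inside wildttstransformer.py are not shown in this thread) and a hypothetical describe helper to print each tensor's rank before the call:

```python
import torch

# Hypothetical shapes, chosen only to reproduce the error message;
# the actual shapes inside MQTTS may differ.
spkr = torch.randn(2, 1, 1, 8)  # 4-D: one dimension more than inp
inp = torch.randn(2, 5, 8)      # 3-D: assumed (batch, time, features)


def describe(name, t):
    """Return a one-line summary of a tensor's rank and shape."""
    return f"{name}: dim={t.dim()}, shape={tuple(t.shape)}"


print(describe("spkr", spkr))
print(describe("inp", inp))

# Concatenating tensors of different rank raises a RuntimeError.
try:
    torch.cat([spkr, inp], 1)
    cat_error = None
except RuntimeError as err:
    cat_error = str(err)

print(cat_error)
```

Printing the shapes just before the failing line in the real code should show which of the two tensors carries the extra dimension and at which position.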

swatsw commented 1 year ago

I tried to fix this by adding spkr = spkr.squeeze(1) in wildttstransformer.py (line 88), but then got the following error:

_IncompatibleKeys(missing_keys=[], unexpected_keys=[same vocoder.* key list as in the output above]) Removing weight norm... Removing weight norm... Inferencing batch 1, total 41 baches.
Traceback (most recent call last):
  File "/home/miniconda3/envs/mqtts/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/miniconda3/envs/mqtts/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
    cli.main()
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
    run()
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/.vscode/extensions/ms-python.python-2023.4.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "infer.py", line 81, in <module>
    synthetic = model(i_wavs, i_phones)
  File "/home/miniconda3/envs/mqtts/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/MQTTS/tester.py", line 71, in forward
    synthetic = self.TTSdecoder.inference_topkp_sampling_batch(phone_features, speaker_embedding, phone_masks, prior=prior)
  File "/home/MQTTS/modules/wildttstransformer.py", line 96, in inference_topkp_sampling_batch
    phone = self.encode_phone(phone, spkr, phone_mask)
  File "/home/MQTTS/modules/wildttstransformer.py", line 78, in encode_phone
    phone, enc_attn = self.encoder(phone, mask=None, attn_bias=phone_alibi, src_key_padding_mask=ex_phone_mask)
  File "/home/miniconda3/envs/mqtts/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/MQTTS/modules/transformers.py", line 247, in forward
    output, attn = mod(output, src_mask=mask, attn_bias=attn_bias, src_key_padding_mask=src_key_padding_mask)
  File "/home/miniconda3/envs/mqtts/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/MQTTS/modules/transformers.py", line 228, in forward
    res, self_attn = self.self_attn(src, src, src, attn_mask=src_mask, attn_bias=attn_bias,
  File "/home/miniconda3/envs/mqtts/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/MQTTS/modules/transformers.py", line 88, in forward
    assert key_padding_mask.size() == (batch_size, k.size(1)), f"Should be {(batch_size, k.size(1))}. Got {key_padding_mask.size()}"
AssertionError: Should be (2, 96). Got torch.Size([2, 88])

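Two things may be worth checking at this point, sketched below with made-up shapes. First, squeeze(1) only drops dimension 1 when its size is 1, so whether it is the right fix depends on where the extra dimension of spkr sits. Second, the assertion compares the padding mask's shape with the key sequence length, so comparing both before the encoder call shows which one is off. The helper check_mask_matches and the (batch, seq_len, dim) layout are assumptions for illustration, not part of MQTTS:

```python
import torch

# squeeze(1) removes dim 1 only if it has size 1; otherwise it is a no-op.
a = torch.zeros(2, 1, 4, 8)
b = torch.zeros(2, 4, 1, 8)
print(a.squeeze(1).shape)  # torch.Size([2, 4, 8])
print(b.squeeze(1).shape)  # torch.Size([2, 4, 1, 8])


def check_mask_matches(features, key_padding_mask):
    """Hypothetical helper: True if the mask has shape (batch, seq_len)
    for features assumed to be laid out as (batch, seq_len, dim)."""
    batch, seq_len = features.shape[0], features.shape[1]
    return tuple(key_padding_mask.shape) == (batch, seq_len)


# Lengths 96 and 88 mirror the assertion message; the feature size 8 is arbitrary.
features = torch.zeros(2, 96, 8)
mask_ok = torch.zeros(2, 96, dtype=torch.bool)
mask_bad = torch.zeros(2, 88, dtype=torch.bool)

print(check_mask_matches(features, mask_ok))
print(check_mask_matches(features, mask_bad))
```

A mismatch like this suggests that the sequence and its mask were built from different lengths somewhere upstream, which a shape print at each step should narrow down.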
nivibilla commented 1 year ago

Hey, I ran into this issue with another model. Make sure your input reference audio is long enough; that's what fixed it for me.
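
A quick way to check the reference audio length before running inference, as a minimal sketch using only the Python standard library. It assumes the reference files are PCM WAV; the function names are hypothetical, and min_seconds is a placeholder to tune, since no minimum length is stated in this thread:

```python
import wave


def wav_duration_seconds(path):
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def is_long_enough(path, min_seconds):
    """Hypothetical check; min_seconds is not a value taken from MQTTS."""
    return wav_duration_seconds(path) >= min_seconds
```

Running is_long_enough over every reference file in the batch would flag the short ones before they reach the model.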