NVIDIA / flowtron

Flowtron is an auto-regressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer.
https://nv-adlr.github.io/Flowtron
Apache License 2.0

Too many values to unpack error in 'Collect z values' section of the style transfer notebook #96

Closed. samialsindi closed this issue 3 years ago.

samialsindi commented 3 years ago

Hi, thanks for making this code available. I've been able to get inference working with inference.py, but not with the style transfer notebook.

Specifically, I get this (using your supplied 'surprised' examples):

ValueError                                Traceback (most recent call last)

      1 force_speaker_id = 0
      2 for i in range(len(dataset)):
----> 3     mel, sid, text = dataset[i]
      4     mel, sid, text = mel[None].cuda(), sid.cuda(), text[None].cuda()
      5     if force_speaker_id > -1:

ValueError: too many values to unpack (expected 3)

I tried to fix this by making line 3 iterative, but that causes worse problems later. Your Data class looks very powerful, but it's also difficult to reverse engineer why this is happening. Thanks in advance.
rafaelvalle commented 3 years ago

Try substituting line 3 with the line below and let us know if you see any other issues.

mel, sid, text, attn_prior = dataset[i]
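(If you want to confirm what the Data class returns for a single item before changing the loop, a quick, purely illustrative check in a notebook cell would be something like:)

item = dataset[0]
print(len(item))                    # expect 4: mel, speaker id, text, attention prior
print([type(x) for x in item])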

samialsindi commented 3 years ago

Thanks for the quick response. I did this and now I get this error:


TypeError                                 Traceback (most recent call last)

      8 in_lens = torch.LongTensor([text.shape[1]]).cuda()
      9 with torch.no_grad():
---> 10     z = model(mel, sid, text, in_lens, None)[0]
     11 z_values.append(z.permute(1, 2, 0))

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/flowtron/flowtron.py in forward(self, mel, speaker_ids, text, in_lens, out_lens, attn_prior)
    700         for i, flow in enumerate(self.flows):
    701             mel, log_s, gate, attn = flow(
--> 702                 mel, encoder_outputs, mask, out_lens, attn_prior)
    703             log_s_list.append(log_s)
    704             attns_list.append(attn)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/flowtron/flowtron.py in forward(self, mel, text, mask, out_lens, attn_prior)
    446         # backwards flow, send padded zeros back to end
    447         for k in range(mel.size(1)):
--> 448             mel[:, k] = mel[:, k].roll(out_lens[k].item(), dims=0)
    449         if attn_prior is not None:
    450             attn_prior[k] = attn_prior[k].roll(out_lens[k].item(), dims=0)

TypeError: 'NoneType' object is not subscriptable
rafaelvalle commented 3 years ago

We need to pass out_lens. Check which dimension of mel is different from 1 and 80; it should be dimension 2. Assuming it is dimension 2, the code below should work. Let us know if you see any other issues.

out_lens = torch.LongTensor([mel.shape[2]]).cuda()
z = model(mel, sid, text, in_lens, out_lens)[0]
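Putting both changes together, the "Collect z values" cell would look roughly like the sketch below. This is only a sketch: it assumes dataset, model, and force_speaker_id are set up as earlier in the notebook, and the body of the force_speaker_id branch (overriding the speaker id) is an assumption, not copied from the notebook.

import torch  # already imported earlier in the notebook

z_values = []
force_speaker_id = 0
for i in range(len(dataset)):
    # the Data class returns four items; the attention prior is not needed here
    mel, sid, text, attn_prior = dataset[i]
    mel, sid, text = mel[None].cuda(), sid.cuda(), text[None].cuda()
    if force_speaker_id > -1:
        sid = sid * 0 + force_speaker_id  # assumed: override the speaker id
    in_lens = torch.LongTensor([text.shape[1]]).cuda()
    # mel is [1, 80, n_frames], so dimension 2 holds the number of frames
    out_lens = torch.LongTensor([mel.shape[2]]).cuda()
    with torch.no_grad():
        z = model(mel, sid, text, in_lens, out_lens)[0]
    z_values.append(z.permute(1, 2, 0))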
samialsindi commented 3 years ago

These two changes you suggested have fixed the issue completely. Thank you for the quick response and excellent help!