yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

First stage training after 49th epoch (i.e., when epoch >= TMA_epoch) #259

[Open] SandyPanda-MLDL opened this issue 5 months ago

SandyPanda-MLDL commented 5 months ago

I keep getting this error in `train_first.py`:

```
File "train_first.py", line 331, in main
    g_loss.requires_grad = True
RuntimeError: you can only change requires_grad flags of leaf variables.
```
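For context, PyTorch raises this error whenever `requires_grad` is assigned on a non-leaf tensor, i.e. a tensor produced by an operation rather than created directly. A minimal standalone repro (illustrative, not StyleTTS2 code):

```python
import torch

x = torch.randn(3, requires_grad=True)  # leaf tensor: created directly
y = x * 2                               # non-leaf tensor: produced by an op
y.requires_grad = True                  # RuntimeError: you can only change
                                        # requires_grad flags of leaf variables.
```

Since `g_loss` below is a weighted sum of loss tensors, it is a non-leaf tensor in exactly this sense.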

```python
if epoch >= TMA_epoch:  # start TMA training
    loss_s2s = 0
    for _s2s_pred, _text_input, _text_length in zip(s2s_pred, texts, input_lengths):
        loss_s2s += F.cross_entropy(_s2s_pred[:_text_length], _text_input[:_text_length])
    loss_s2s /= texts.size(0)

    loss_mono = F.l1_loss(s2s_attn, s2s_attn_mono) * 10

    loss_gen_all = gl(wav.detach().unsqueeze(1).float(), y_rec).mean()
    print(f'the shape of both wav and y_rec respectively {wav.shape} and {y_rec.shape}')
    loss_slm = wl(wav.detach(), y_rec.squeeze(1)).mean()

    g_loss = loss_params.lambda_mel * loss_mel + \
        loss_params.lambda_mono * loss_mono + \
        loss_params.lambda_s2s * loss_s2s + \
        loss_params.lambda_gen * loss_gen_all + \
        loss_params.lambda_slm * loss_slm
    print(f'Generator loss is {g_loss}')

else:
    loss_s2s = 0
    loss_mono = 0
    loss_gen_all = 0
    loss_slm = 0
    g_loss = loss_mel
    print(f'else Generator loss is {g_loss}')

running_loss += accelerator.gather(loss_mel).mean().item()
# print(f"g-loss is {type(g_loss)}")
optimizer.zero_grad()
g_loss.requires_grad = True  # line 331: this assignment raises the RuntimeError
accelerator.backward(g_loss)
# g_loss.requires_grad = True
# g_loss.backward()
```
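The usual fix is simply to delete the `g_loss.requires_grad = True` line: a tensor built from differentiable operations on model outputs already has `requires_grad=True`, so forcing the flag is both unnecessary and illegal on a non-leaf. A minimal sketch under that assumption (an illustration, not an official patch from the repo):

```python
optimizer.zero_grad()
# No flag assignment needed: g_loss inherits requires_grad from loss_mel,
# loss_s2s, etc., which depend on the model parameters.
accelerator.backward(g_loss)
```

If `accelerator.backward(g_loss)` then complains that `g_loss` does not require grad, the real problem is upstream, e.g. a forward pass run under `torch.no_grad()` or a tensor that was `.detach()`-ed, and the fix is to remove that detachment rather than to force the flag.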