uberduck-ai / uberduck-ml-dev

ML models for Uberduck
Apache License 2.0
377 stars 61 forks

Non-Attentive fixes #115

Closed johnpaulbin closed 1 year ago

johnpaulbin commented 1 year ago

Adds inference_noattention() to models/tacotron2 for easier inference.

Fixes get_alignment() so that it accepts GSTs (global style tokens).

johnpaulbin commented 1 year ago

Example usage code:

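import torch
from IPython.display import Audio

# Assumes taco, hifigan, text_padded, input_lengths, embedding, output and
# device are already defined, with both models living on `device`.

# `output` comes from an earlier call (not shown); its postnet mel is used
# here as the target mel for get_alignment().
original_target_mel = output["mel_outputs_postnet"].to(device)

speaker_ids = torch.LongTensor([0]).to(device)

inputs = (
    text_padded,
    input_lengths,
    original_target_mel,
    torch.LongTensor([0]).to(device),  # placeholder, zero in this example
    torch.LongTensor([0]).to(device),  # placeholder, zero in this example
    speaker_ids,
    embedding,
)

attn = taco.get_alignment(inputs)

# Run inference again, passing in the alignment instead of computing attention.
noattention_output = taco.inference_noattention(
    text_padded, input_lengths, speaker_ids, embedding, attn.transpose(0, 1)
)

# as_tensor avoids the copy (and warning) that torch.tensor() triggers on a tensor.
mel = torch.as_tensor(
    noattention_output["mel_outputs_postnet"], dtype=torch.float, device=device
)
y_g_hat = hifigan.vocoder.forward(mel)

# Flatten to a single channel and scale to the 16-bit sample range.
audio = y_g_hat.reshape(1, -1)
audio = audio * 32768.0

Audio(audio.cpu().detach().numpy(), rate=22050)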
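
To save the result outside a notebook, the same 32768 scaling can be applied when writing a 16-bit WAV file. The following is a minimal standard-library sketch and is not part of this PR: the helper names float_to_pcm16 and write_wav are hypothetical, and it assumes the vocoder output is a flat sequence of floats in roughly [-1.0, 1.0]. It clips after scaling, since +1.0 * 32768 would otherwise overflow the int16 range.

```python
import io
import sys
import wave
from array import array


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to clipped 16-bit PCM integers (hypothetical helper)."""
    pcm = array("h")
    for s in samples:
        # Scale by 32768, then clip so +1.0 maps to 32767 instead of overflowing.
        value = int(round(s * 32768.0))
        pcm.append(max(-32768, min(32767, value)))
    return pcm


def write_wav(fileobj, samples, rate=22050):
    """Write mono 16-bit PCM audio to a path or file-like object (hypothetical helper)."""
    pcm = float_to_pcm16(samples)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(fileobj, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(pcm.tobytes())


# Small in-memory demonstration with four hand-picked samples.
buffer = io.BytesIO()
write_wav(buffer, [0.0, 0.5, 1.0, -1.0])
```

With the variables from the example above, the unscaled vocoder output could be passed as y_g_hat.reshape(-1).tolist(), together with a file path in place of the in-memory buffer.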
sjkoelle commented 1 year ago

looks good to me