axinc-ai / ailia-models

The collection of pre-trained, state-of-the-art AI models for ailia SDK

ADD GPT SOVITS V2 #1522

Closed kyakuno closed 1 day ago

kyakuno commented 1 month ago

https://github.com/RVC-Boss/GPT-SoVITS

prettypanghu commented 3 weeks ago

https://github.com/RVC-Boss/GPT-SoVITS/wiki/

kyakuno commented 3 weeks ago

Here is the code that was rewritten when converting GPT-SoVITS V1 to ONNX: https://github.com/RVC-Boss/GPT-SoVITS/pull/835

kyakuno commented 3 weeks ago

@ooe1123 Once the older PRs have been cleaned up, I would appreciate it if you could look into this one.

ooe1123 commented 2 days ago

Export speed parameter

〇 GPT_SoVITS\module\models_onnx.py

Before:

class TextEncoder(nn.Module):
    ...
    def forward(self, y, text, ge):
        ...
        y = self.encoder2(y * y_mask, y_mask)

After:

class TextEncoder(nn.Module):
    ...
    def forward(self, y, text, ge, speed):
        ...
        y = self.encoder2(y * y_mask, y_mask)

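        if torch.onnx.is_in_onnx_export():
            # A Python `if` on `speed` would be fixed at trace time, so the
            # target length is selected by indexing instead:
            # a[0] is the resampled length, a[1] is the unchanged length.
            a = torch.cat([
                ((y.shape[-1] / speed).long()+1).unsqueeze(0),
                y.shape[-1].unsqueeze(0)
            ], dim=0)
            # Index 1 (unchanged length) when speed == 1.0, otherwise index 0.
            size = a[(speed == torch.tensor(1.0)).long()]
            y = F.interpolate(y, size=size, mode="linear")
            y_mask = F.interpolate(y_mask, size=y.shape[-1], mode="nearest")
        else:
            # Eager mode: resample only when the speed actually changes.
            if speed != 1:
                y = F.interpolate(y, size=int(y.shape[-1] / speed)+1, mode="linear")
                y_mask = F.interpolate(y_mask, size=y.shape[-1], mode="nearest")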
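
The index-based selection in the export branch can be easier to follow in isolation. The following is a minimal pure-Python sketch of the length computation only, not the actual model code: the function names `target_length_eager` and `target_length_branchless` are hypothetical, and plain integers stand in for the tensors used above. It is meant to show that both branches should pick the same target length.

```python
def target_length_eager(length: int, speed: float) -> int:
    """Eager-style path: resample only when speed differs from 1."""
    if speed != 1:
        return int(length / speed) + 1
    return length


def target_length_branchless(length: int, speed: float) -> int:
    """Export-style path: build both candidates, then select one by index."""
    candidates = [int(length / speed) + 1, length]
    # int(True) == 1 selects the unchanged length,
    # int(False) == 0 selects the resampled length.
    return candidates[int(speed == 1.0)]


if __name__ == "__main__":
    for speed in (0.5, 1.0, 1.25, 2.0):
        eager = target_length_eager(100, speed)
        branchless = target_length_branchless(100, speed)
        print(speed, eager, branchless)
```

With this formula a speed above 1 yields a shorter sequence and a speed below 1 a longer one, while a speed of exactly 1 leaves the length unchanged in both variants.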