Closed: chaseaucoin closed this issue 10 months ago
Okay well, shoot, I actually found the issue.
This appears to be a known issue with Conv1d inference on the GPU:
https://github.com/pytorch/pytorch/issues/98688
I was able to address the issue with os.environ["TORCH_CUDNN_V8_API_DISABLED"] = "1"
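For reference, a minimal sketch of how the workaround can be applied; note that setting the variable before importing torch is my assumption about the safe ordering, not something I verified:

```python
import os

# Workaround for the cuDNN v8 Conv1d memory blow-up: disable the v8 API.
# Assumption: this should be set before torch initializes cuDNN, so do it
# before importing torch (or at least before any CUDA work happens).
os.environ["TORCH_CUDNN_V8_API_DISABLED"] = "1"

import torch  # noqa: E402  (import deliberately placed after the env var is set)

# ... load XTTS / the LLM and run inference as usual ...
```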
I'll leave this bug report up in case someone else runs into the same issue, but I'm going to go ahead and close it.
Describe the bug
When running XTTSv2 on an RTX 3090 under WSL2 (Ubuntu 22.04 on Windows 11), I intermittently get memory explosions during inference. It seems to happen when I have a Hugging Face transformers LLM loaded at the same time as XTTS. I traced it to the forward pass of HifiganGenerator, specifically o = self.conv_pre(x), where self.conv_pre is just weight_norm(Conv1d(in_channels, upsample_initial_channel, 7, 1, padding=3)). I couldn't narrow it down any further, but for some reason this call consumes all available GPU memory. Before hitting this line the system is using 8 GB of VRAM; as soon as it hits it, usage jumps to 23.7+ GB and the system starts to freeze.
Any help would be awesome but it is a weird bug.
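In case it helps anyone debug, here is a minimal sketch of the layer in isolation, with memory reported before and after the call. The channel sizes and input length are guesses on my part, not the actual XTTS config values:

```python
import torch
from torch.nn.utils import weight_norm

# Stand-in for HifiganGenerator.conv_pre; channel sizes below are assumptions.
in_channels = 1024
upsample_initial_channel = 512

conv_pre = weight_norm(
    torch.nn.Conv1d(in_channels, upsample_initial_channel, 7, 1, padding=3)
).cuda()

x = torch.randn(1, in_channels, 500, device="cuda")  # dummy latent sequence

print(f"before: {torch.cuda.memory_allocated() / 1e9:.2f} GB allocated")
with torch.no_grad():
    o = conv_pre(x)  # the call where VRAM usage exploded for me
torch.cuda.synchronize()
print(f"after:  {torch.cuda.memory_allocated() / 1e9:.2f} GB allocated")
print(f"peak:   {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```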
To Reproduce
I'm not able to reproduce this on any of the leased machines I have; it only happens on my RTX 3090. The steps seem to be (sketched below):
1. Load the XTTS model
2. Load a Hugging Face LLM
3. Run inference via inference_stream
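Roughly, in code. This is a sketch only: the call signatures are from the coqui-tts and transformers docs as I remember them and may need adjusting, and the checkpoint path, reference audio, and LLM name are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

# 1. Load the XTTS model (paths are placeholders).
config = XttsConfig()
config.load_json("/path/to/xtts/config.json")
xtts = Xtts.init_from_config(config)
xtts.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
xtts.cuda()

# 2. Load a Hugging Face LLM onto the same GPU (model name is a placeholder).
tokenizer = AutoTokenizer.from_pretrained("some-llm")
llm = AutoModelForCausalLM.from_pretrained("some-llm", torch_dtype=torch.float16).cuda()

# 3. Run streaming inference; the blow-up happens inside the vocoder's conv_pre.
gpt_cond_latent, speaker_embedding = xtts.get_conditioning_latents(audio_path=["ref.wav"])
for chunk in xtts.inference_stream(
    "Hello there, this is a streaming test.",
    "en",
    gpt_cond_latent,
    speaker_embedding,
):
    pass  # consume the audio chunks
```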
Expected behavior
Memory pressure may fluctuate a bit, but not by 16+ GB.
Logs
No response
Environment
Additional context
No response