neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality
Apache License 2.0

Preset breaks on 3090, works on 3080 #113

Open planetrocke opened 2 years ago

planetrocke commented 2 years ago

Why would these presets break on one card and not another?

```python
'rough': {'num_autoregressive_samples': 4, 'diffusion_iterations': 12},

'rough': {'num_autoregressive_samples': 12, 'diffusion_iterations': 16},
```

On a 3080, it runs fine, but on a 3090 it does this:


```
Generating autoregressive samples..
0it [00:00, ?it/s]
Computing best candidates using CLVP
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "read.py", line 65, in <module>
    gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
  File "/root/tortoise_good/tortoise/api.py", line 325, in tts_with_preset
    return self.tts(text, **settings)
  File "/root/tortoise_good/tortoise/api.py", line 449, in tts
    clip_results = torch.cat(clip_results, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors
```
neonbjb commented 2 years ago

There's a bug in the logic that feeds the CLVP model where `num_autoregressive_samples < self.autoregressive_batch_size`. I believe the fix is to add a line `batch_size = min(num_autoregressive_samples, self.autoregressive_batch_size)` here: https://github.com/neonbjb/tortoise-tts/blob/main/tortoise/api.py#L404, then use `batch_size` in lieu of `self.autoregressive_batch_size` throughout. I can't test this right now, though.
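
That also matches the log above: when the preset's `num_autoregressive_samples` is smaller than the batch size, the integer division that determines the number of batches comes out to 0, so the sampling and CLVP loops never run (the `0it` lines) and `clip_results` is still empty when `torch.cat()` is called. A standalone sketch of that arithmetic and of the proposed clamp (the batch-size value of 16 is just a hypothetical default for illustration):

```python
# Standalone illustration of the failure and the suggested clamp (not Tortoise code).
num_autoregressive_samples = 4   # from the 'rough' preset above
autoregressive_batch_size = 16   # hypothetical default batch size

# Buggy behaviour: a small preset yields zero batches, so no candidates are ever
# produced and torch.cat() later receives an empty list.
num_batches_buggy = num_autoregressive_samples // autoregressive_batch_size
print(num_batches_buggy)         # 0 -> the loops run zero iterations

# Suggested fix: clamp the batch size before computing the number of batches.
batch_size = min(num_autoregressive_samples, autoregressive_batch_size)
num_batches_fixed = num_autoregressive_samples // batch_size
print(batch_size, num_batches_fixed)  # 4 1 -> the loop now runs at least once
```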

planetrocke commented 2 years ago

OK, I will try this out, thanks. Could this also be why my 3090 system completely crashes while running read.py (after a random number of lines, even if I put a sleep between inferences), while my 3080 system just keeps going?

neonbjb commented 2 years ago

No, I don't think so. That sounds like a hardware issue. What kind of PSU are you using? 3090s have spike loads up to 550W (something I encountered and had to work around while training this...)

planetrocke commented 2 years ago

It likely is. While the power draw doesn't come close to the PSU's limit, I have read somewhere that you need a platinum- or titanium-grade PSU to deal with the spikes. I've done several other types of training and inference and haven't encountered this issue, so I'm curious how you worked around the problem.


neonbjb commented 2 years ago

In my case I was operating 4 3090s on a single 1600W PSU and had to migrate to a setup with 3 3090s per PSU. The main system PSU (which powers the CPU and accessories) only powers 2 GPUs.

I also had a 1300W PSU which tripped overcurrent with 3 3090s. I sold that one. :)

All my PSUs are EVGA Golds, so I don't think you need to chase the ratings - Golds do work.

planetrocke commented 2 years ago

This happened with a single 3090 and a 1600W PSU. No crash reports or logs of any kind; the machine just suddenly crashed and rebooted.


neonbjb commented 2 years ago

Ah, probably not the PSU then. I still think it is system-related. It's interesting that Tortoise triggers it but not other workloads. The only unique thing Tortoise might be doing that other programs don't is computing FFTs.

planetrocke commented 2 years ago

I'm new to this part of it... how do you think the Fourier analysis would affect the system specifically? The only other unique thing is that this machine uses an Intel CPU, whereas my 3080 system uses an AMD one. No idea if that helps, but just FYI.

neonbjb commented 2 years ago

Different types of CUDA operations stress the hardware in different ways. Most ML programs just do matrix multiplies and convolutions, so it is possible that the reason your system crashes with Tortoise and not with other workloads is that Tortoise uses Fourier analysis.

This is just me talking out loud. It doesn't really help you. :( I'm not really sure what to suggest.
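
If you want to test that hypothesis in isolation, one rough diagnostic (purely hypothetical, not part of Tortoise) would be to hammer the GPU with FFT-heavy work on its own and see whether that alone reproduces the crash, with no matrix multiplies or convolutions in the mix:

```python
# Hypothetical FFT stress loop (not part of Tortoise): if this brings the machine
# down but a plain matmul burn-in does not, spectral ops are a likely trigger.
import torch

device = "cuda"
x = torch.randn(64, 2 ** 18, device=device)           # batch of long random "signals"
window = torch.hann_window(1024, device=device)

for step in range(10_000):
    spec = torch.stft(x, n_fft=1024, hop_length=256, window=window, return_complex=True)
    recon = torch.fft.irfft(torch.fft.rfft(x, dim=-1), n=x.shape[-1], dim=-1)
    torch.cuda.synchronize()                           # force each step to finish on the GPU
    if step % 500 == 0:
        print(f"step {step}: spec {tuple(spec.shape)}, recon {tuple(recon.shape)}")
```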

chrisbward commented 1 year ago

https://github.com/neonbjb/tortoise-tts/issues/189

chrisbward commented 1 year ago

@planetrocke Mine is also crashing: 3090 Ti, 11th Gen Intel i9-11900K (16) @ 5.10GHz.

neonbjb commented 1 year ago

Hey guys, sorry this is happening to you, but I'm quite confident it isn't Tortoise at fault. I developed this on a Windows machine with a 3090 and regularly use it on that machine. I'd look at drivers, the CUDA installation, and the Python environment first, then look at firmware and system specs.
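
For reference, a quick way to dump the versions actually seen by the environment that runs Tortoise (purely illustrative; any equivalent check works):

```python
# Illustrative environment report: shows the PyTorch / CUDA / cuDNN combination
# and the GPU visible to the interpreter that runs Tortoise.
import torch

print("torch:", torch.__version__)
print("built against CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
```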

planetrocke commented 1 year ago

For me it literally only happens when I set it to a faster setup (fewer passes).


MV799 commented 1 year ago

Running it on 3090 on windows with no issues

chrisbward commented 1 year ago

> For me it literally only happens when I set it to a faster setup (fewer passes).

So to confirm, it works fine with no crashes when doing more passes?

I missed this in the docs... is it one of the flags?