mush42 / sonata-nvda

This add-on implements a speech synthesizer driver for NVDA using neural TTS models. It supports Piper.
GNU General Public License v2.0

Enabling GPU acceleration #10

Open roshkins opened 1 year ago

roshkins commented 1 year ago

I read in the Piper readme that it supports GPU acceleration. I dug through the add-on locally, but there doesn't seem to be a way to enable it easily, since it's all compiled Rust. Any idea how to get GPU acceleration working so it doesn't eat up all my CPU?

vortex1024 commented 1 year ago

It is quite complicated. It can't use CUDA in its current form, since NVDA is 32-bit and CUDA is 64-bit only. I managed to enable DirectML, which is another way of using the GPU, but something is wrong, since it frequently crashes with no error, and it is not faster than the CPU version.
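For reference, the add-on itself does this from compiled Rust, but in ONNX Runtime's Python API "enabling DirectML" roughly means requesting the DirectML execution provider ahead of the CPU fallback, as in the sketch below. It assumes an ONNX Runtime build that ships the DirectML provider (for example the onnxruntime-directml package) and a hypothetical model.onnx path standing in for a Piper voice model.

```python
# Sketch only: prefer DirectML, fall back to CPU, in ONNX Runtime's Python API.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # hypothetical path to a Piper voice model
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

# Shows which providers were actually registered; if DirectML failed to load,
# only the CPU provider will appear here.
print(session.get_providers())
```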

mush42 commented 1 year ago

I can confirm the issue with DirectML, which seems to be the best option for Windows since it is not limited to CUDA.

Still, I have some hope of getting DirectML working once we upgrade to ONNX Runtime v1.15.

ONNX Runtime also supports Intel-specific inference accelerators, such as oneDNN and OpenVINO, but they require building ONNX Runtime from source, and I don't know their impact on speed. I'll investigate them when I have some free time.
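For anyone who wants to experiment: a quick way to see which execution providers a given ONNX Runtime build was compiled with (the Intel ones only appear in builds that include them) is something like this sketch, again with a hypothetical model.onnx path.

```python
# Sketch only: list the providers available in this ONNX Runtime build, then
# create a session preferring the Intel accelerators when they are present.
import onnxruntime as ort

available = ort.get_available_providers()
print(available)

preferred = ["OpenVINOExecutionProvider", "DnnlExecutionProvider", "CPUExecutionProvider"]
session = ort.InferenceSession(
    "model.onnx",  # hypothetical path to a Piper voice model
    providers=[p for p in preferred if p in available],
)
```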

Also, WindowsML, which is different from DirectML, is under consideration, since it is the native ML platform built into Windows 10 and later.

Best

roshkins commented 1 year ago

Is it possible to launch a new 64-bit process that uses IPC to transmit data? Or is that the complication you're talking about, @vortex1024?

mush42 commented 1 year ago

Actually, ONNX Runtime on 32-bit is 2x slower than it is on 64-bit systems.

The question is: which IPC approach to use? We need an approach with very low overhead; we're talking about milliseconds here.

vortex1024 commented 12 months ago

@mush42 sockets with protobuf?

beqabeqa473 commented 11 months ago

Are stdin/stdout slow on Windows? What do you think? Or maybe COM?
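One way to get a rough number for the stdin/stdout idea: spawn the 64-bit worker as a subprocess and time a length-prefixed echo round-trip over its pipes. The sketch below is not from the add-on; "worker.exe" is a hypothetical 64-bit synthesis worker that echoes each framed message back.

```python
# Sketch only: measure stdin/stdout round-trip latency to a hypothetical
# 64-bit worker process using 4-byte length-prefixed frames.
import struct
import subprocess
import time

proc = subprocess.Popen(
    ["worker.exe"],  # hypothetical 64-bit worker that echoes framed messages
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

def round_trip(payload: bytes) -> bytes:
    # Write a 4-byte little-endian length prefix followed by the raw payload.
    proc.stdin.write(struct.pack("<I", len(payload)) + payload)
    proc.stdin.flush()
    # Read the reply using the same framing.
    (length,) = struct.unpack("<I", proc.stdout.read(4))
    return proc.stdout.read(length)

start = time.perf_counter()
round_trip(b"hello")
print(f"round trip: {(time.perf_counter() - start) * 1000:.2f} ms")
```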