Feature Request better speech input / push to talk

FalconOscuro / Helldivers-Voice-Stratagem


Feature Request better speech input / push to talk #2

Open senposage opened 8 months ago

senposage commented 8 months ago

Currently the voice recognition library leaves a lot to be desired. It doesn't deal with noisy microphones or acoustic echoes very well at all, resulting in many misreads or failures. PocketSphinx is very outdated and prone to making errors even when the recording environment is perfect.

Maybe something like this Python library: https://pypi.org/project/SpeechRecognition/ OpenAI Whisper and Vosk can work offline. If that is not an option, detection MAY be somewhat improved by only listening while the stratagem key is held. Currently the detection seems to run off on its own and not hang up when you are done speaking.
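To make the push-to-talk idea concrete, here is a minimal sketch of a buffer that only keeps audio while the key is held and hands it to a recognizer on release. It is not taken from the project: `PushToTalkBuffer`, its method names, and the `recognize` callable are all hypothetical, and the callable stands in for whichever engine is chosen (for example one wrapped from the SpeechRecognition package).

```python
from typing import Callable, List, Optional


class PushToTalkBuffer:
    """Collects audio frames only while the push-to-talk key is held (hypothetical helper)."""

    def __init__(self, recognize: Callable[[bytes], str]) -> None:
        # Assumption: any speech engine can be wrapped as a bytes -> text callable.
        self._recognize = recognize
        self._frames: List[bytes] = []
        self._held = False

    def key_down(self) -> None:
        # Start a fresh utterance each time the key is pressed.
        self._frames.clear()
        self._held = True

    def feed(self, frame: bytes) -> None:
        # Frames that arrive while the key is up are dropped.
        if self._held:
            self._frames.append(frame)

    def key_up(self) -> Optional[str]:
        # Releasing the key ends the utterance, so nothing waits on silence detection.
        if not self._held:
            return None
        self._held = False
        audio = b"".join(self._frames)
        self._frames.clear()
        return self._recognize(audio) if audio else None
```

Because the key release marks the end of the utterance, the recognizer never has to decide on its own when speech has stopped.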

FalconOscuro commented 8 months ago

Currently it is set up to only listen whilst the stratagem key is held; it not hanging up may be due to microphone noise.

As for changing the voice recognition library, I am currently porting the entire program over to C++, and am planning on using the built-in Windows voice recognition functionality.
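A rough illustration of how microphone noise could keep a listener from hanging up, assuming an energy-threshold end-of-speech check. That is an assumption for the sake of the example; the thread does not say how the project detects the end of speech, and `rms` and `frames_until_silence` are hypothetical helpers.

```python
import array
import math
from typing import Sequence


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a frame of 16-bit signed, native-endian PCM."""
    samples = array.array("h", frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def frames_until_silence(frames: Sequence[bytes], threshold: float, max_quiet: int) -> int:
    """Number of frames kept before `max_quiet` consecutive quiet frames end the utterance.

    Returns len(frames) if the audio never stays below `threshold` for long enough.
    """
    quiet = 0
    for i, frame in enumerate(frames):
        # Count consecutive frames below the threshold; any loud frame resets the count.
        quiet = quiet + 1 if rms(frame) < threshold else 0
        if quiet >= max_quiet:
            return i + 1 - max_quiet
    return len(frames)
```

If the noise floor of the microphone sits above `threshold`, no frame ever counts as quiet and the function runs to the end of the input, which matches the "does not hang up" symptom.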

senposage commented 8 months ago

Cool beans. You will likely want to use Interception for the input library then; GameGuard likes to inject itself into every process on the system, preventing most programs from sending macros: https://github.com/oblitum/Interception

I had forked your repo and started porting it to PYinterception, but haven't had time to finish it.

As for the speech API, SAPI is deprecated and nearly as horrible as PocketSphinx, especially if your disability causes you to slur or if your microphone does a poor job of filtering noise.

FalconOscuro commented 8 months ago

I don't want to meddle with Interception: if Arrowhead/GameGuard change tack, they may not look too favourably on people using this. I am instead opting to just interface directly with the Windows API, which currently seems to work. Not to stop you ofc, feel free to fork.

As for the speech API, I'm going to focus on just getting this up and running again first, before doing more in-depth research.