continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

TTS is using legacy speech output on Windows 11 #2312

Open Snowman-25 opened 4 days ago

Snowman-25 commented 4 days ago


Relevant environment info

- OS: Windows 11 23H2 (Build 22631.4169)
- Continue:   v0.8.52
- IDE:        VSCode 1.92.1 (user setup)
- Model:      n/a
- config.json:

{
  "models": [
    {
      "title": "Ollama",
      "provider": "ollama",
      "model": "AUTODETECT"
    }
  ],
  "customCommands": [
    {
      "name": "test",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive set of unit tests for the selected code. It should setup, run tests that check for correctness including important edge cases, and teardown. Ensure that the tests are complete and sophisticated. Give the tests just as chat output, don't edit any file.",
      "description": "Write unit tests for highlighted code"
    },
    {
      "name": "desc",
      "prompt": "{{{ input }}}\n\nWrite a comprehensive comment for each block of the selected code. It should describe what it does and show possible caveats.",
      "description": "Comment the highlighted code"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Starcoder 3b",
    "provider": "ollama",
    "model": "starcoder2"
  },
  "allowAnonymousTelemetry": false,
  "embeddingsProvider": {
    "provider": "transformers.js"
  },
  "contextProviders": [
    {
      "name": "open",
      "params": {
        "onlyPinned": false
      }
    }
  ],
  "experimental": {
    "readResponseTTS": true
  },
  "ui": {
    "showChatScrollbar": true
  },
  "docs": []
}

Description

After activating TTS, I noticed that I couldn't change the voice and speed of the output in the Windows Settings app.

After some searching around, I found the legacy Text-to-Speech settings in the old Control Panel and discovered that these are the TTS settings Continue uses. The legacy voices sound very machine-like and may even mispronounce everything if set to the wrong language. I'd much rather use the "Natural voice" TTS settings that Windows 11 provides.

To reproduce

Activate TTS by adding "readResponseTTS": true to the experimental section of your config.json, then enter any prompt in the chat window.
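For reference, the relevant portion of the config above is just this block:

{
  "experimental": {
    "readResponseTTS": true
  }
}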

Log output

No response

Patrick-Erichsen commented 4 days ago

This is good to know, thanks for the heads up here @Snowman-25! Appreciate this and the other issues you've raised.

TTS was a community-contributed feature, so I'm not very familiar with the intricacies of how it's configured. Mind linking the Continue code you were looking at? Also, any idea if it's straightforward to update the settings to use the "Natural voice"?

Snowman-25 commented 3 days ago

Hi,

I wasn't looking at any code specifically, but you referenced this handler in another issue:

https://github.com/continuedev/continue/blob/0cfcc599de1a54ae0e9c1bbcb09c5f3d0e03a126/core/util/tts.ts#L67-L72

Quick note here: PowerShell 7 / .NET Core 3.1 deprecated the System.Speech.Synthesis.SpeechSynthesizer API in favour of SAPI.SpVoice.
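For context, on Windows that handler boils down to something like the sketch below (not the actual Continue code; the helper name is made up). The System.Speech synthesizer takes its default voice and rate from the legacy SAPI settings, which would explain why only the old Control Panel dialog has any effect:

import { exec } from "node:child_process";

// Rough sketch of the legacy path: speak `text` on Windows by spawning
// Windows PowerShell (5.x) and using the System.Speech (SAPI-backed)
// synthesizer. Illustration only, not the actual Continue implementation.
function speakWithSystemSpeech(text: string): void {
  const escaped = text.replace(/'/g, "''"); // escape single quotes for PowerShell
  const script =
    "Add-Type -AssemblyName System.Speech; " +
    "$s = New-Object System.Speech.Synthesis.SpeechSynthesizer; " +
    `$s.Speak('${escaped}')`;
  exec(`powershell -NoProfile -Command "${script}"`);
}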

According to https://stackoverflow.com/questions/77443751/how-to-access-newly-added-natural-voices-in-powershell-after-windows-11-update there are apparently three different voice APIs, and the way it's currently done accesses the old SAPI voices.

According to the same link, the "Natural Voice" voices can't be used via an API at all. That leaves us with the "One Core" voices. Unfortunately, the example given in the StackOverflow question for One Core voices doesn't work on PowerShell Core. That isn't a problem per se, since even Windows 11's default PowerShell is the Desktop edition 5.x.
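For completeness, the COM-based SAPI.SpVoice route mentioned above is available from both Windows PowerShell 5.x and PowerShell 7, though by default it still speaks with the legacy SAPI voices. A minimal sketch (again just an illustration, helper name made up):

import { exec } from "node:child_process";

// Rough sketch of the COM route: SAPI.SpVoice can be created from both
// PowerShell editions, but it still picks up the legacy SAPI voices by default.
function speakWithSpVoice(text: string): void {
  const escaped = text.replace(/'/g, "''"); // escape single quotes for PowerShell
  const script =
    "$v = New-Object -ComObject SAPI.SpVoice; " +
    `$v.Speak('${escaped}') | Out-Null`;
  exec(`powershell -NoProfile -Command "${script}"`);
}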

Since the One Core voices sound really similar to the SAPI voices, there's no immediate need to change anything.