SillyTavern / Extension-Speech-Recognition

Convert your speech to text using browser or extras.
GNU General Public License v3.0

Broken, sent in WAV files seem to be completely empty. #2

Open outNEXT opened 8 months ago

outNEXT commented 8 months ago

The generated wav files that are created when recording in the browser are always blank. I have confirmed that my microphone works in other browser applications and websites.

Cohee1207 commented 8 months ago

Provide some details like your browser, STT backend, just anything.

outNEXT commented 8 months ago

> Provide some details like your browser, STT backend, just anything.

I'm on Google Chrome. SillyTavern is 1.11.3, and I also tried 1.11.4 (staging).

Velour-Fog commented 5 months ago

I can confirm that I also have this issue. I have narrowed it down to a Chrome-specific issue, as it works fine in Firefox.

I installed this extension via the download extension feature and started the SillyTavern-extras server with --enable-modules=whisper-stt. I set the provider to "Whisper (Extras)" in the Speech Recognition settings. When I hit record and talk, the resulting "stt_test.wav" file has no audio recorded, but is the correct length. When I switch over to Firefox, it works totally fine.

Somehow the audio isn't being picked up in Chrome for me, even though I allowed microphone access in the SillyTavern UI. I even tried it after disabling all Chrome browser extensions. Opening a new tab and going to a microphone test website shows that it's definitely picking up the microphone.

PS: It also works fine when I use "Browser" under "Select Speech-to-text Provider". So it's able to record audio via that, just not via "Whisper (Extras)".
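
For anyone trying to reproduce this, the "correct length but no audio" symptom can be checked without listening to the file. The following is a minimal sketch using only the Python standard library; it assumes the recording is uncompressed 16-bit PCM, which the thread does not confirm. The function names and the 0.001 silence threshold are illustrative choices, not part of the extension or of SillyTavern-extras.

```python
import struct
import sys
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def wav_peak_amplitude(path: str) -> float:
    """Return the peak absolute sample value, scaled to the range 0.0 to 1.0.

    Assumption: the file holds 16-bit little-endian PCM samples.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frames[: count * 2])
    return max(abs(sample) for sample in samples) / 32768.0


def is_silent(path: str, threshold: float = 0.001) -> bool:
    """Treat the file as silent when no sample exceeds the (arbitrary) threshold."""
    return wav_peak_amplitude(path) < threshold


# Only runs when a file path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    target = sys.argv[1]
    print(f"duration: {wav_duration_seconds(target):.2f} s")
    print(f"peak:     {wav_peak_amplitude(target):.4f}")
    print(f"silent:   {is_silent(target)}")
```

Running it against a recording made in Chrome and one made in Firefox should show whether the Chrome file really contains only zero-valued (or near-zero) samples while still reporting the expected duration.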

SolvAI commented 4 months ago
  +1

  Additional information (when trying to use Whisper, which apparently calls transformers.js):

  ...
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799,
  -0.7977839112281799
],
size: 240000

} }
Error: failed to call OrtRun(). error code = 6.
    at e.run (C:\SillyTavern\node_modules\onnxruntime-web\dist\ort-web.node.js:6:454392)
    at e.run (C:\SillyTavern\node_modules\onnxruntime-web\dist\ort-web.node.js:6:443740)
    at e.OnnxruntimeWebAssemblySessionHandler.run (C:\SillyTavern\node_modules\onnxruntime-web\dist\ort-web.node.js:6:446675)
    at m.run (C:\SillyTavern\node_modules\onnxruntime-common\dist\ort-common.node.js:6:9992)
    at sessionRun (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:207:36)
    at encoderForward (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:520:18)
    at Function.seq2seqForward [as _forward] (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:361:34)
    at Function.forward (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:820:27)
    at Function.seq2seqRunBeam [as _runBeam] (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:480:29)
    at Function.runBeam (file:///C:/SillyTavern/node_modules/sillytavern-transformers/src/models.js:1373:27)
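
Every value visible in the dump above is identical (-0.7977839112281799), and the reported size of 240000 equals 80 × 3000. That product would match a Whisper-style log-mel input of 80 mel bins by 3000 frames, but this is an observation, not something the thread confirms. To check whether a whole array is constant rather than just the printed tail, the values can be summarized as in the sketch below. The function name and the returned keys are hypothetical, and the sample list only repeats the value shown in the log.

```python
from typing import Dict, Sequence, Union


def summarize_values(
    values: Sequence[float], tolerance: float = 0.0
) -> Dict[str, Union[int, float, bool]]:
    """Report count, min, max, and whether all values are (nearly) identical."""
    if len(values) == 0:
        raise ValueError("no values to summarize")
    lowest = min(values)
    highest = max(values)
    return {
        "count": len(values),
        "min": lowest,
        "max": highest,
        # The array is "constant" when its spread does not exceed the tolerance.
        "constant": (highest - lowest) <= tolerance,
    }


# The tail of the dump, as printed in the log: the same value on every line.
logged_tail = [-0.7977839112281799] * 49
print(summarize_values(logged_tail))
```

If the full array turns out to be constant, the model input carries no usable signal. If only the tail is constant, the flat region may just be padding after a short recording.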