cyrenity / mod_whisper

A FreeSWITCH module to interface to your speech recognition server over websocket

make error #13

Open jneuendorf-i4h opened 11 months ago

jneuendorf-i4h commented 11 months ago

Hey again 😅

I think I got the build setup working (see #12). When running make, I get the following error (FreeSWITCH 1.10, Debian 12):

making all mod_whisper
make[4]: Entering directory '/usr/src/freeswitch/src/mod/asr_tts/mod_whisper'
  CC       mod_whisper_la-mod_whisper.lo
  CC       mod_whisper_la-websock_glue.lo
websock_glue.c: In function 'ws_asr_thread_run':
websock_glue.c:301:53: error: 'n' is used uninitialized [-Werror=uninitialized]
  301 |         while (context->started == WS_STATE_STARTED && n >= 0) {
      |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
websock_glue.c:300:13: note: 'n' was declared here
  300 |         int n;
      |             ^
cc1: all warnings being treated as errors
make[4]: *** [Makefile:751: mod_whisper_la-websock_glue.lo] Error 1
make[4]: Leaving directory '/usr/src/freeswitch/src/mod/asr_tts/mod_whisper'

What would be a reasonable initial value of n?

Setting int n = 1; would work, provided the context is never empty when context->started == WS_STATE_STARTED. 🤷‍♂️

From this example it seems that n represents the payload size (length of context->result_text?).

cyrenity commented 11 months ago

I didn't get this error on our server. Maybe you can try removing n completely, or initializing it with a non-zero value, i.e. int n = 1;
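The suggested fix can be sketched as below. This is a minimal illustration, not the actual code from websock_glue.c: the thread only shows the declaration of n and the loop condition, so the struct layout, the `service_once` helper, and the `remaining` field are hypothetical stand-ins, and the sketch assumes n is reassigned inside the loop body. Since the condition is n >= 0, any non-negative initial value (0 as well as 1) lets the loop start.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real mod_whisper definitions. */
enum { WS_STATE_STOPPED = 0, WS_STATE_STARTED = 1 };

struct ws_context {
    int started;
    int remaining; /* hypothetical: successful service calls left before a failure */
};

/* Hypothetical service call: returns >= 0 on success, < 0 on failure. */
static int service_once(struct ws_context *context)
{
    if (context->remaining <= 0) {
        return -1;
    }
    context->remaining--;
    return 0;
}

/* Runs the loop and returns how many times the body executed. */
static int run_loop(struct ws_context *context)
{
    /* Initialized, so the first evaluation of the loop condition is well
     * defined and -Werror=uninitialized no longer fires. */
    int n = 0;
    int iterations = 0;

    assert(context != NULL);

    while (context->started == WS_STATE_STARTED && n >= 0) {
        n = service_once(context); /* assumption: n is updated in the body */
        iterations++;
    }
    return iterations;
}
```

With `started` set to WS_STATE_STARTED and `remaining` set to 3, `run_loop` executes the body four times: three successful calls, then the failing call that makes n negative and ends the loop.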

jneuendorf-i4h commented 11 months ago

> I didn't get this error on our server. Maybe you can try removing n completely, or initializing it with a non-zero value, i.e. int n = 1;

Thanks for the quick reply. Initializing n = 1 worked. 👍


Btw, was this module designed with a particular Whisper websocket server in mind? I couldn't find one that fits this module out of the box. If not, could you clarify the protocol: which message types exist, and in what order are they sent?

Thank you 🙏 😊

cyrenity commented 11 months ago

It's not written for any particular websocket server. For speech recognition, you simply send audio bytes to a websocket listener, and the server returns the output as a literal JSON string. For TTS, you send a JSON string, and the websocket server returns audio bytes.
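The exchange described above could be modeled roughly as follows. This sketch leaves out the network layer entirely and only shows the request/reply pairing on the server side. Everything in it is an assumption made for illustration: the thread does not say whether audio travels as binary websocket frames, and it does not specify the JSON layout, so the frame types, the `handle_frame` and `fake_transcribe` names, the `"text"` key, and the 160-byte silent audio reply are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the two payload kinds in the exchange. */
enum frame_kind { FRAME_BINARY, FRAME_TEXT };

struct frame {
    enum frame_kind kind;
    const unsigned char *data;
    size_t len;
};

struct reply {
    enum frame_kind kind;
    unsigned char data[256];
    size_t len;
};

/* Placeholder recognizer: a real server would run a speech model here. */
static const char *fake_transcribe(const unsigned char *audio, size_t len)
{
    (void)audio;
    (void)len;
    return "hello world";
}

/* Pairs each incoming payload with the kind of reply the thread describes. */
static void handle_frame(const struct frame *in, struct reply *out)
{
    assert(in != NULL && out != NULL);

    if (in->kind == FRAME_BINARY) {
        /* ASR: audio bytes in, JSON string out ("text" key is an assumption). */
        int written = snprintf((char *)out->data, sizeof(out->data),
                               "{\"text\": \"%s\"}",
                               fake_transcribe(in->data, in->len));
        out->kind = FRAME_TEXT;
        out->len = (written > 0 && (size_t)written < sizeof(out->data))
                       ? (size_t)written : 0;
    } else {
        /* TTS: JSON string in, audio bytes out (placeholder: silence). */
        out->kind = FRAME_BINARY;
        out->len = 160;
        memset(out->data, 0, out->len);
    }
}
```

A real server would wrap this dispatch in a websocket library's receive callback and replace the placeholders with actual recognition and synthesis.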