MayamaTakeshi / sip-lab

A node module that helps to write SIP functional tests

Consider creating a dual-tone voice #92

Open · MayamaTakeshi opened this issue 3 months ago

MayamaTakeshi commented 3 months ago

In my tests I have been using DTMF as a "voice" to be detected by a speech recognition system that understands DTMF. However, this conflicts with cases where the speech recognition system also supports real DTMF detection.

As an alternative, I am implementing speech synthesis and recognition using flite and pocketsphinx. However, at an 8000 Hz sampling rate (pcmu/pcma, PSTN), pocketsphinx does not perform well, and even with speex at 16000 Hz, where pocketsphinx shows better results, it would still require trial-and-error tuning.

As another alternative, I am planning to add support for external speech synthesis/recognition via a websocket server that would proxy requests to google speech, amazon polly, openai whisper (which can be run locally), etc.

But still, we might prefer an approach that does not rely on voice recognition at all. For that, we can try extending the DTMF tones and creating a new "dual-tone" voice. This way, there will be no conflict.
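As a rough illustration of the idea, here is a minimal C sketch that generates one "dual-tone" symbol as the sum of two sine waves, analogous to DTMF but using a frequency pair that does not belong to the standard DTMF row/column sets (697/770/852/941 Hz and 1209/1336/1477/1633 Hz), so an ordinary DTMF detector should ignore it. The frequency pair (500 Hz + 1900 Hz), symbol length and amplitudes are assumptions for the example, not a proposed spec.

```c
/* Sketch: synthesize one hypothetical "dual-tone" symbol (two summed sines). */
#include <math.h>
#include <stdint.h>
#include <stddef.h>

#define SAMPLE_RATE 8000
#define TWO_PI      6.283185307179586

/* Fill buf with n samples of a two-frequency tone (frequencies are assumptions). */
static void dual_tone_fill(int16_t *buf, size_t n, double f_low, double f_high)
{
    for (size_t i = 0; i < n; i++) {
        double t = (double) i / SAMPLE_RATE;
        /* scale each component so the sum stays within 16-bit range */
        double s = 0.5 * sin(TWO_PI * f_low * t) + 0.5 * sin(TWO_PI * f_high * t);
        buf[i] = (int16_t) (s * 16383.0);
    }
}

int main(void)
{
    int16_t symbol[SAMPLE_RATE / 10];   /* 100 ms symbol at 8000 Hz */
    /* Example pair chosen only to avoid the standard DTMF frequencies. */
    dual_tone_fill(symbol, sizeof(symbol) / sizeof(symbol[0]), 500.0, 1900.0);
    return 0;
}
```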

MayamaTakeshi commented 3 months ago

https://www.montana.edu/rmaher/eele477_sp18/EELE_477_Lab_09.pdf

https://chat.openai.com/share/84e6ce44-5ab4-4cb6-8072-cc9f22676e74

So let's try patching https://github.com/freeswitch/spandsp/blob/master/src/dtmf.c and see what happens.
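For the detection side, the basic technique is a Goertzel power estimate per candidate frequency, which is also what dtmf.c is built around. The snippet below is a standalone sketch of that technique, not an actual patch to spandsp (its internal tables and descriptors are not reproduced here); a real patch would add the new frequency pairs alongside the existing DTMF row/column tables. The threshold and the `dual_tone_present` helper are hypothetical.

```c
/* Sketch: Goertzel-based presence check for a hypothetical dual-tone pair. */
#include <math.h>
#include <stdint.h>
#include <stddef.h>

#define SAMPLE_RATE 8000
#define TWO_PI      6.283185307179586

/* Goertzel magnitude-squared of target_freq over n samples. */
static double goertzel_power(const int16_t *buf, size_t n, double target_freq)
{
    double k = 2.0 * cos(TWO_PI * target_freq / SAMPLE_RATE);
    double s0, s1 = 0.0, s2 = 0.0;
    for (size_t i = 0; i < n; i++) {
        s0 = buf[i] + k * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - k * s1 * s2;
}

/* A symbol is considered present if both of its frequencies exceed the threshold. */
int dual_tone_present(const int16_t *buf, size_t n,
                      double f_low, double f_high, double threshold)
{
    return goertzel_power(buf, n, f_low) > threshold
        && goertzel_power(buf, n, f_high) > threshold;
}
```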