OvidijusParsiunas / deep-chat

Fully customizable AI chatbot component for your website
https://deepchat.dev
MIT License
1.26k stars 170 forks

Events when audio is activated / deactivated #177

Open FranzHell opened 2 months ago

FranzHell commented 2 months ago

image

Dear @OvidijusParsiunas

Similar to the event shown in the screenshot, would it be possible to have an event fired when the audio component starts and when it finishes (on click of the audio icon, when it turns black/red)? I want to record the audio, but I don't know when to start, and the speech-to-text component does not offer an option for this.

image

In the speech-to-element you have 2 options that would come in handy, onStart and onStop:

  window.SpeechToElement.toggle("azure", {
    region: "REGION",
    subscriptionKey: "SUBSCRIPTION-KEY",
    element: textEl,
    onStart: () => {
      buttonEl.innerText = "Stop";
    },
    onStop: () => {
      buttonEl.innerText = "Start";
    },
    onError: () => {
      errorEl.style.display = "block";
      buttonEl.innerText = "Start";
    }
  });

Would you be able to help?

Thanks!

OvidijusParsiunas commented 2 months ago

Hi @FranzHell.

I can potentially add this feature in our next release. However, the problem is that while speech to text is active, the browser controls the microphone input, so the audio input for your recording may sometimes not work (because it is being used by the browser). This happened to me a few times on a Zoom call: people stopped hearing me while I was using speech to text.

So implementing this feature may be redundant. I would first suggest that you see if you are able to record when speech to text is on to make sure that it 100% works for you. I would suggest doing multiple tests, e.g. start the recording - then turn speech to text on, try to toggle it, and perhaps try recording when speech to text is already on. As I mentioned, this bug was happening to me sometimes, hence you may need to test this for a bit.

If you are ultimately happy that it does work correctly, I will add the events into our API.

Thank you.

FranzHell commented 2 months ago

The problem I have with deep-chat is that I don't have (or don't know about) any event that tells me when the mic is triggered to be active, i.e. when speech to text has started. I am using speech-to-element at the moment, where the onStart and onStop methods are available. There I can successfully trigger the audio recording, every time I have tried. I can also successfully replay the audio, post it to an API endpoint, and listen to the stored file.

OvidijusParsiunas commented 1 month ago

Hi @FranzHell. My apologies for the wait. I have added the events to the speechToText property. They are available via an events property which has the following type:

onStart?: () => void;
onStop?: () => void;
onResult?: (text: string, isFinal: boolean) => void;
onPreResult?: (text: string, isFinal: boolean) => void;
onCommandModeTrigger?: (isStart: boolean) => void;
onPauseTrigger?: (isStart: boolean) => void;

You can use the events like this:

speechToText = {
  events: {
    onStart: () => {
      console.log('start');
    }
  }
}
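For the recording use case raised earlier in this thread, the events could be wired to a recorder roughly as follows. This is a minimal sketch, not Deep Chat code: createRecordingEvents and fakeRecorder are hypothetical names, and the recorder is assumed to expose start(), stop() and a state field the way the browser MediaRecorder does.

```javascript
// Hypothetical helper: builds the `events` object from a recorder-like object.
// In a browser, `recorder` could be a MediaRecorder created from getUserMedia.
function createRecordingEvents(recorder, log = console.log) {
  return {
    onStart: () => {
      log('speech to text started');
      if (recorder.state === 'inactive') recorder.start();
    },
    onStop: () => {
      log('speech to text stopped');
      // Guard the call: MediaRecorder throws if stop() is called while inactive.
      if (recorder.state !== 'inactive') recorder.stop();
    },
  };
}

// Stand-in recorder so the sketch also runs outside a browser.
const fakeRecorder = {
  state: 'inactive',
  start() { this.state = 'recording'; },
  stop() { this.state = 'inactive'; },
};

const speechToText = {
  webSpeech: true,
  events: createRecordingEvents(fakeRecorder),
};

// In a page, the object would be assigned to the element as a property, e.g.
// document.querySelector('deep-chat').speechToText = speechToText;
```

Whether the recording and the speech service can share the microphone reliably is a separate question and still needs to be tested in the target browsers.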

This is currently only available in the deep-chat-dev and deep-chat-react-dev packages, version 9.0.170.

Let me know if these work for you.

FranzHell commented 1 month ago

I have two issues.

With

import 'deep-chat-dev';

I always get the demo response

image

even though I have it hooked up to a custom API:

        <deep-chat
          speechToText='{

            "azure": {
              "subscriptionKey": "XXXXXXX",
              "region": "eastus",
              "language": "en-US",
              "stopAfterSilenceMs": 5000
            }
          }'
          :request="getRequest"
        />

With import 'deep-chat'; (that is, when I do not import the dev package) I do get a custom response from my local API.

Can you think of why?

And on another note:

  speechToText='{
    "webSpeech": true,
    "translations": {"hello": "goodbye", "Hello": "Goodbye"},
    "commands": {"resume": "resume", "settings": {"commandMode": "hello"}},
    "button": {"position": "outside-left"},
    "events": {
      "onStart": () => console.log("Speech recognition started"),
      "onStop": () => console.log("Speech recognition stopped"),
      "onResult": (text, isFinal) => console.log(`Result received: ${text}, Final: ${isFinal}`),
      "onPreResult": (text, isFinal) => console.log(`Preliminary result received: ${text}, Final: ${isFinal}`),
      "onCommandModeTrigger": (isStart) => console.log(`Command mode ${isStart ? "started" : "stopped"}`),
      "onPauseTrigger": (isStart) => console.log(`Pause ${isStart ? "initiated" : "ended"}`)
    }
  }'

Would this be a proper integration of the events, passing them as a string together with the other speech options? I can't see any logs. Could you help me out there?

OvidijusParsiunas commented 1 month ago

Hi @FranzHell.

Could you share what getRequest returns, or how that method operates?

In regards to the events API, everything checks out fine from my initial inspection. The documentation will be updated accordingly once it goes out to the main package.
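If the logs still do not appear, one thing that may be worth ruling out is how the events reach the component. The sketch below is plain JavaScript, independent of Deep Chat, and rests on the assumption that a string attribute is parsed as JSON: functions cannot survive that round trip, so handlers written inside a string would be lost or would make the parse fail.

```javascript
const config = {
  webSpeech: true,
  events: {
    onStart: () => console.log('Speech recognition started'),
  },
};

// JSON.stringify silently drops function-valued properties.
const serialized = JSON.stringify(config);
const roundTripped = JSON.parse(serialized);
const handlerSurvived = typeof roundTripped.events.onStart === 'function';

// A string containing an arrow function literal is not valid JSON at all.
let parseFailed = false;
try {
  JSON.parse('{"events": {"onStart": () => console.log("start")}}');
} catch (error) {
  parseFailed = error instanceof SyntaxError;
}

console.log({ serialized, handlerSurvived, parseFailed });
```

Under that assumption, passing the configuration as an object (for example through a property binding, or by assigning it to the element's speechToText property in script) would keep the handler functions intact.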

FranzHell commented 1 month ago

Part of the script of the Vue.js component:

    const getRequest = computed(() => {
      if (!sessionId.value || !threadId.value) {
        return null; // This prevents trying to use it before being ready
      }
      return {
        url: "http://127.0.0.1:5000/drive-avatar",
        headers: {
          'X-CSRFToken': getCSRFToken(),
        },
        additionalBodyProps: {
          assistant_id: props.assistant,
          session_id: sessionId.value,
          thread_id: threadId.value
        }
      };
    });

I am using additionalBodyProps. This works fine with the deep-chat package, where I do get a response:

image

OvidijusParsiunas commented 1 month ago

OK, I experimented with this a bit. I believe request is not working for you because the new Deep Chat API does not register it as an actual property and does not listen for changes to it, so the values returned by getRequest have no effect. I will have to note this as a non-backwards-compatible change. In the meantime, I recommend replacing :request with the new :connect property, after which everything should work. Note that stream has also moved into connect, if you are using it. The documentation for the full API changes will be updated when the new Deep Chat version is released.
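A rough sketch of what that migration might look like in script form. The helper name toConnect and the example values are hypothetical, and the sketch assumes that connect accepts the same fields that were previously passed to request (url, headers, additionalBodyProps), which the updated documentation would need to confirm.

```javascript
// Hypothetical migration helper; not part of Deep Chat.
// Copies the old `request` fields and folds the former top-level `stream` value in.
function toConnect(request, stream) {
  if (!request) return null; // mirror the "not ready yet" case from getRequest
  const connect = { ...request };
  if (stream !== undefined) connect.stream = stream;
  return connect;
}

// Example values only; the ids are placeholders.
const oldRequest = {
  url: 'http://127.0.0.1:5000/drive-avatar',
  additionalBodyProps: { session_id: 'example-session', thread_id: 'example-thread' },
};

const connect = toConnect(oldRequest, true);
console.log(connect);
```

In the Vue template, the computed value would then be bound with :connect instead of :request.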