Picovoice / picovoice

On-device voice assistant platform powered by deep learning
Apache License 2.0

Picovoice Documentation Issue: is there a discord server? #840

Closed: tval2 closed this issue 1 month ago

tval2 commented 1 month ago

What is the URL of the doc?

https://picovoice.ai/blog/javascript-voice-activity-detection/

What is the nature of the issue?

Right now I'm just trying to understand the differences between @picovoice/web-voice-processor, @picovoice/cobra-web, and @picovoice/cobra-node. I'm working on a Next.js-style project and obviously don't want to expose my API key to the front end, but I'm also streaming in live mic audio and would like to detect in real time whether someone is speaking (basically, flash a green light when someone is speaking and show a red light otherwise).

Figured Discord would be the best medium to discuss stuff like this quickly, but unfortunately I didn't see any way to interact with the Picovoice community.

albho commented 1 month ago

@tval2 - Sorry, we don't have a Discord server, so GitHub issues are currently the best way to discuss issues.

To answer your question: web-voice-processor converts the microphone sampling rate to 16 kHz and passes the audio to web engines such as cobra-web, which takes the audio and returns a score between 0 and 1 representing how likely it is that each frame of audio contains human voice (vs. ambient noise, for example). cobra-node works the same way but runs on Node.js, so it cannot be used directly with web-voice-processor; instead, it can be used with PvRecorder.
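As a rough illustration of how the per-frame score described above could drive the green/red indicator from the original question, here is a minimal TypeScript sketch. `createIndicator`, the `0.5` threshold, and the `holdFrames` setting are hypothetical choices made for this example and are not part of any Picovoice package; the score is assumed to come from whichever Cobra binding you end up using.

```typescript
// Hypothetical helper, not a Picovoice API: maps per-frame voice scores to a light color.
type Light = "green" | "red";

interface IndicatorOptions {
  threshold: number;  // score at or above which a frame counts as speech (assumed value)
  holdFrames: number; // number of non-speech frames the light stays green after speech
}

function createIndicator(
  options: IndicatorOptions = { threshold: 0.5, holdFrames: 10 }
): (voiceProbability: number) => Light {
  // Frames elapsed since the last speech frame; Infinity means "no speech seen yet".
  let framesSinceSpeech = Infinity;

  return function update(voiceProbability: number): Light {
    if (voiceProbability >= options.threshold) {
      framesSinceSpeech = 0;
    } else {
      framesSinceSpeech += 1;
    }
    // Holding the green state for a few frames avoids flicker between words.
    return framesSinceSpeech <= options.holdFrames ? "green" : "red";
  };
}
```

Calling the returned `update` function once per scored frame yields the color to display. The hold period is a design choice for this sketch: without it, short pauses inside a sentence would make the light flip to red and back.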

So unfortunately we don't have a good solution for your problem: voice activity detection in the browser means using cobra-web, which requires your API key to be stored in the frontend, whereas using cobra-node in your case would require sending audio from the frontend to the backend.

Hopefully this at least clarifies how each of those packages is intended to work.

tval2 commented 1 month ago

@albho thanks for getting back! I'm still a bit lost, however.

Your answer above made it seem as if there's no way to build an app with Picovoice that doesn't expose the API key. But I can't imagine that's the case, as no developer wants to expose API keys.

Unless you're saying that I should just stick with cobra-node and I should be fine; is that what you were implying? I figured I would just import cobra-node on the backend in a route.ts file and then stream audio chunks to it via a serverless fetch call from the frontend.

Let me know if I'm misinterpreting something here.

albho commented 1 month ago

Yes - to avoid putting your API key in the frontend, you can use cobra-node instead and stream audio to it from the frontend.
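One practical detail of this approach is that audio chunks arriving from the frontend will rarely line up with the fixed-size frames a per-frame engine scores. The sketch below shows one way the backend might re-chunk incoming 16-bit PCM before scoring. `FrameBuffer` and the `scoreFrame` callback are hypothetical names introduced for this example; the callback stands in for whatever per-frame call your cobra-node instance exposes, and the frame length is passed in rather than assumed. It also assumes the frontend already sends 16 kHz, 16-bit mono samples.

```typescript
// Hypothetical callback type: stands in for the engine's per-frame scoring call.
type ScoreFrame = (frame: Int16Array) => number;

// Hypothetical helper: accumulates PCM chunks and scores every complete frame.
class FrameBuffer {
  private pending: Int16Array = new Int16Array(0);
  private readonly frameLength: number;
  private readonly scoreFrame: ScoreFrame;

  constructor(frameLength: number, scoreFrame: ScoreFrame) {
    this.frameLength = frameLength;
    this.scoreFrame = scoreFrame;
  }

  // Appends a chunk and returns one score per complete frame now available.
  push(chunk: Int16Array): number[] {
    // Join leftover samples from the previous call with the new chunk.
    const merged = new Int16Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    const scores: number[] = [];
    let offset = 0;
    while (offset + this.frameLength <= merged.length) {
      scores.push(this.scoreFrame(merged.subarray(offset, offset + this.frameLength)));
      offset += this.frameLength;
    }

    // Keep the incomplete tail for the next chunk.
    this.pending = merged.slice(offset);
    return scores;
  }
}
```

In a route handler you would keep one `FrameBuffer` per audio stream, call `push` with each decoded chunk, and return the resulting scores to the frontend. Note that a stateless serverless function would lose the pending samples between requests, so a long-lived connection or frame-aligned chunks from the client may be a better fit.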

tval2 commented 1 month ago

@albho Okay, great - that's what I thought. Just wanted to make sure I was using the right one, given the two different tutorials in the docs. I ran into a different error earlier today, but I put that on the Cobra repo since that's probably the more appropriate spot in case someone else has the same issue.

https://github.com/Picovoice/cobra/issues/220