edimuj / app-audioinput-demo

A Cordova app for demonstrating the microphone capture capabilities of cordova-plugin-audioinput.
https://github.com/edimuj/cordova-plugin-audioinput
MIT License
12 stars 11 forks

[Feature Req] input volume based audio capture stop for voice rec #1

Open olivermuc opened 5 years ago

olivermuc commented 5 years ago

Hi Edin,

First off, awesome contribution to the Cordova world of plugins & sample code here.

A key feature for voice recognition, and probably other use cases, would be volume-based stopping of audio capture to automatically kick off further processing/NLP.

How would I go about this using your Cordova plugin, given the sample code for events here?

...or better yet, is your (older) speech rec project still available and compatible with the current version of your plugin?

Thanks! Oliver

PS: Commodore64 was a fantastic way to get started - same here :)

edimuj commented 5 years ago

Thank you, Oliver!

I agree, being able to get audio chunks of speech is very useful in many scenarios, which is the reason why I created the project you mention. As I mentioned in the other issue, that project has since been purchased by a company for commercial use, so it is sadly no longer available as open source, but a man needs to eat 😀

The audioinput plugin can be used in many ways, so that type of feature is not something that will be implemented in the plugin, since I believe that Cordova plugins should focus on doing one thing, and doing that well. Basically keeping it clean and simple.

But with that said, it is of course still possible to implement similar functionality outside the audioinput plugin, using for example Web Audio by continuously analyzing the captured audio and creating chunks of audio above a certain audio level.
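A minimal sketch of that idea, using a plain RMS level check rather than a Web Audio analyser node. It is not part of the plugin and not the approach used in the older speech project. The names `createSilenceDetector`, `threshold`, `silenceMs` and `onSpeechEnd` are hypothetical, and it assumes the captured audio arrives as chunks of samples normalized to the range -1 to 1:

```javascript
// Root-mean-square level of one chunk of samples in the range [-1, 1].
function rms(samples) {
  if (samples.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < samples.length; i++) {
    sum += samples[i] * samples[i];
  }
  return Math.sqrt(sum / samples.length);
}

// Hypothetical helper (not a plugin API): collects chunks once the level
// rises above `threshold` and hands them to `onSpeechEnd` after at least
// `silenceMs` of consecutive quiet chunks.
function createSilenceDetector(options) {
  const threshold = options.threshold;     // RMS level treated as speech
  const sampleRate = options.sampleRate;   // samples per second
  const silenceMs = options.silenceMs;     // trailing silence that ends a chunk
  const onSpeechEnd = options.onSpeechEnd; // receives the collected samples

  let collected = [];   // chunks of the current utterance
  let speaking = false; // true once a loud chunk has been seen
  let silentMs = 0;     // duration of the current run of quiet chunks

  return function process(samples) {
    if (rms(samples) >= threshold) {
      speaking = true;
      silentMs = 0;
    } else if (speaking) {
      silentMs += (samples.length * 1000) / sampleRate;
    }
    if (!speaking) return; // ignore silence before any speech

    collected.push(Array.from(samples));
    if (silentMs >= silenceMs) {
      const utterance = collected.flat(); // includes the trailing silence
      collected = [];
      speaking = false;
      silentMs = 0;
      onSpeechEnd(utterance);
    }
  };
}
```

Each captured chunk would be passed to the returned `process` function from whatever audio event handler the app already uses. The threshold and silence duration need tuning per device and environment, since a fixed level check is easily fooled by background noise.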

The C64 was indeed a really amazing machine. Later on, after transitioning to the C128, I started using the Amiga and only reluctantly had to give that up pretty late, in 1999. But still we can both be proud of being part of the Generation C64 ☺

Are you on LinkedIn?

olivermuc commented 5 years ago

Yes, Edin, I'm also on LinkedIn :-) Again, apologies for the level of persistence; time is currently not on my side. But I think I managed to find a usable implementation of volume-based silence recognition. Probably neither fully accurate nor elegant, but it will do. I fully appreciate the license constraint, and equally agree that one has got to make a living! I think you were quite ahead of the major voice processing/recognition players that are now refocusing on the topic in combination with NLP and other cool tech.

Re: where such a "volume" feature could or should sit, I also agree with your statement, with the one exception that performance is always an issue, especially in hybrid apps, and if some of the background processing can be kept at a lower level of the stack, I wouldn't say no :)

PS: The C64 got me straight to the Atari ST corner - yep, the other "side", lol. Those were all great times, including assembly language, floppy disks and whatnot.