ricky0123 / vad

Voice activity detector (VAD) for the browser with a simple API
https://www.vad.ricky0123.com

-Added changes to real-time-vad to #142

Closed johncrawley closed 1 month ago

johncrawley commented 1 month ago

- Added new test in the test site
- Updated the changelog
- Updated the package version

Description of changes

To get vad-web working in a Firefox extension, the AudioWorkletNode constructor's third parameter (AudioWorkletNodeOptions) can now be set via the MicVAD constructor. A Firefox client can pass in an empty object, and the DataCloneError is then not thrown. (Note that this requires a custom worklet bundle with a hardcoded frameSamples value when instantiating the Resampler.)
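As a rough illustration of the idea (not the code from this PR), the choice of options could look like the sketch below. The helper name `resolveWorkletOptions`, the `WorkletOptionsLike` type, and the default of passing `frameSamples` through `processorOptions` are assumptions made for the example.

```typescript
// Minimal local stand-in for the DOM's AudioWorkletNodeOptions,
// so the sketch also runs outside a browser.
interface WorkletOptionsLike {
  processorOptions?: Record<string, unknown>;
}

// Hypothetical helper: pick the options object that is handed to the
// AudioWorkletNode constructor as its third argument.
function resolveWorkletOptions(
  frameSamples: number,
  override?: WorkletOptionsLike,
): WorkletOptionsLike {
  // An explicit override is used untouched, even when it is {}.
  // This is the path a Firefox extension client would take.
  if (override !== undefined) {
    return override;
  }
  // Assumed default: forward frameSamples to the worklet processor.
  return { processorOptions: { frameSamples } };
}
```

In a browser the result would be passed along the lines of `new AudioWorkletNode(ctx, processorName, resolveWorkletOptions(frameSamples, userOptions))`, where `processorName` and `userOptions` are placeholders. With an empty override the processor receives no `frameSamples`, which is why the custom worklet bundle has to hardcode it.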

In real-time-vad.ts, the Firefox extension environment doesn't recognize the frame passed in to AudioNodeVAD.processFrame() as a Float32Array, and onnx-runtime fails. This is solved with a change to real-time-vad.ts that recreates the ArrayBuffer at vadnode.port.onmessage (let buffer = ev.data.data;) if it is found not to be an instance of ArrayBuffer.

In the Firefox extension environment, the Float32Array built from the recreated ArrayBuffer is then recognized as a valid Float32Array by the instanceof operator and onnx-runtime operations proceed.
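A minimal sketch of the buffer re-creation described above, assuming the message payload is an ArrayBuffer-like object that only fails the `instanceof` check. The helper names `copyToLocalBuffer` and `frameFromMessage` are hypothetical and this is not the exact code from the PR.

```typescript
// Hypothetical helper: copy the bytes of a buffer into an ArrayBuffer
// that is created by the current environment's ArrayBuffer constructor.
function copyToLocalBuffer(source: ArrayBufferLike): ArrayBuffer {
  const local = new ArrayBuffer(source.byteLength);
  new Uint8Array(local).set(new Uint8Array(source));
  return local;
}

// Hypothetical helper: build the audio frame from a worklet message payload.
function frameFromMessage(data: ArrayBufferLike): Float32Array {
  // Reuse the payload when it passes the instanceof check, otherwise
  // rebuild it locally so that the resulting Float32Array is recognized.
  const buffer = data instanceof ArrayBuffer ? data : copyToLocalBuffer(data);
  return new Float32Array(buffer);
}
```

Inside the `onmessage` handler this would be called as `frameFromMessage(ev.data.data)` before handing the frame on for processing. The copy only happens when the check fails, so environments where the check passes pay no extra cost.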

I've also added a test to the test site that confirms that an empty worklet options object can be passed into the MicVAD constructor, and that everything works fine, if an appropriately modified worklet bundle is also specified.

The changes have also been tested with automated tests and with manual tests running on Firefox Nightly builds for Android and Linux.

Checklist

ricky0123 commented 1 month ago

Thanks a lot for your contribution, @johncrawley. Could you clarify how vad.worklet.bundle.min.custom.js was generated?

johncrawley commented 1 month ago

Ricky, thanks very much for the merge. I edited the original vad.worklet.bundle.min.js directly to include the hard-coded 'targetFrameSize' value. From now on, for my own development projects, I'll be using a fork of vad to generate the custom file.