Open ThEditor opened 2 weeks ago
@ThEditor I am developing an automatic speech recognition feature that needs Node to run VAD so it can split the audio into separate segments for translation. Can you tell me how to use it? Thank you very much.
@MhandsomeM The README.md of the fork shows how to use it, though the fork is not available as an npm package. Let me know if I should publish it; until then, you can copy the RealTimeVAD class into your source.
@ThEditor Thank you very much for your reply. I found the usage in the README and tested it, but there is something wrong with the printed output: it seems to consist only of 0 and 255. Can you help me take a look?
const options = {
  sampleRate: 16000, // sample rate of the input audio
  minBufferDuration: 1, // minimum audio buffer to store
  maxBufferDuration: 5, // maximum audio buffer to store
  overlapDuration: 0.1, // how much of the previous buffer carries into the new buffer
  silenceThreshold: 0.5, // threshold for ignoring pauses in speech
  frameSamples: 512, // samples per VAD frame
  positiveSpeechThreshold: 0.7,
  // negativeSpeechThreshold: 0.7,
  redemptionFrames: 10,
  preSpeechPadFrames: 5,
  minSpeechFrames: 30,
  submitUserSpeechOnPause: true,
};
const rtvad = new vad.RealTimeVAD(options);
rtvad.on("data", (data) => {
  console.log("data", Buffer.from(data.audio).toJSON());
});
data {
type: 'Buffer',
data: [
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255,
255, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255,
... 27548 more items
]
}
Can you show me how exactly you are passing data to RealTimeVAD? (the part that calls the processAudio function)
@ThEditor The chunks are the data coming from the microphone; each chunk is 256 bytes long, and I do some processing on the data.
const BUFFER_SIZE = 1536;
let bufferArr = Buffer.alloc(0);

// Each chunk is 256 bytes; accumulate them into 1536-byte buffers here.
async function receiveAudioChunk(chunk) {
  bufferArr = Buffer.concat([bufferArr, chunk]);
  if (bufferArr.length >= BUFFER_SIZE) {
    await rtvad.processAudio(bufferArr);
    bufferArr = Buffer.alloc(0); // clear the buffer
  }
}
While this is running, I want to access my original data source:
rtvad.on("data", (data) => {
  console.log("data", Buffer.from(data.audio).toJSON());
});
I think it's better if we discuss this either on an issue in my fork or on Discord (id: theditor).
So I tried using NonRealTimeVAD, but my use case required a real-time version of it. I've created a fork that adds this functionality, but I've never really worked with playwright tests, so I wasn't able to open a pull request. I've added a RealTimeVAD class which builds on top of NonRealTimeVAD. Let me know if this change is something that can be pulled in (also, I need help with the playwright tests :sob:). I've manually tested it using node-record-lpcm16.