Open bdytx5 opened 1 year ago
Hi Brett,
Would it be possible to share the Swift code for the sliding window? Perhaps I can offer a suggestion once I have an idea of how it is implemented.
Michael
Here is the code that implements a sliding window to read and store audio.
```swift
func startMicrophone() {
    print("start microphone")
    let recordingFrameBuffer = bufferSize / 2
    var recordingBuffer: [Int16] = []
    var inferenceCount: Int = 1
    let numOfInferences = self.numOfInferences
    let inputSize = self.inputSize
    let recordingFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(16000), channels: 1, interleaved: true)
    let inputNode = audioEngine.inputNode
    let inputFormat = inputNode.outputFormat(forBus: 0)
    guard let formatConverter = AVAudioConverter(from: inputFormat, to: recordingFormat!) else {
        return
    }
    var window = [Int16]()
    var pcmWindow = [AVAudioPCMBuffer]()

    // Remove the existing tap, if any
    do {
        try audioEngine.inputNode.removeTap(onBus: 0)
    }
    catch {
        print("no tap to remove")
    }

    // Install a tap on the audio engine and append the converted frames to the sliding window
    audioEngine.inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(bufferSize), format: inputFormat) { (buffer, time) in
        self.conversionQueue.async {
            let pcmBuffer = AVAudioPCMBuffer(pcmFormat: recordingFormat!, frameCapacity: AVAudioFrameCount(self.bufferSize))
            var error: NSError? = nil
            let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
                outStatus.pointee = AVAudioConverterInputStatus.haveData
                return buffer
            }
            formatConverter.convert(to: pcmBuffer!, error: &error, withInputFrom: inputBlock)
            if error != nil {
                print(error!.localizedDescription)
            }
            else if let channelData = pcmBuffer!.int16ChannelData {
                let channelDataValue = channelData.pointee
                // Index with the converted buffer's stride, since the samples come from pcmBuffer
                let channelDataValueArray = stride(from: 0, to: Int(pcmBuffer!.frameLength), by: pcmBuffer!.stride).map { channelDataValue[$0] }
                if window.count < 16000 {
                    // Still filling the first window
                    window.append(contentsOf: channelDataValueArray)
                    pcmWindow.append(pcmBuffer!) // store the converted buffer in both branches
                } else {
                    // Window is full: append the newest samples and drop the oldest 1600
                    window.append(contentsOf: channelDataValueArray)
                    pcmWindow.append(pcmBuffer!)
                    window.removeFirst(1600)
                    pcmWindow.removeFirst()
                    // Run inference
                    self.recognize(onBuffer: window, pcms: pcmWindow)
                }
            } // channelData
        } // conversion queue
    } // installTap

    audioEngine.prepare()
    do {
        try audioEngine.start()
    }
    catch {
        print(error.localizedDescription)
    }
}
```

Also, here is some Kotlin code that I found and modified to run inference more frequently:
https://www.tensorflow.org/lite/android/tutorials/audio_classification
By modifying the `interval` in this code from the tutorial:

```kotlin
executor = ScheduledThreadPoolExecutor(1)
executor.scheduleAtFixedRate(classifyRunnable, 0, interval, TimeUnit.MILLISECONDS)
```

I was able to run inference more often without implementing a sliding window buffer. However, I was not able to figure out how to do this in the plugin code.
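For readers following along, here is a minimal Java sketch of the same fixed-rate scheduling idea, using only `java.util.concurrent`. The class names `PeriodicInference` and `PeriodicInferenceDemo` and the helper `runsAtLeast` are hypothetical and exist only for illustration; the task passed in stands in for the tutorial's `classifyRunnable`, and nothing here is taken from the plugin.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around a fixed-rate scheduling call (illustration only).
final class PeriodicInference {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

    // Runs `task` immediately and then every `intervalMs` milliseconds.
    void start(Runnable task, long intervalMs) {
        executor.scheduleAtFixedRate(task, 0, intervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        executor.shutdownNow();
    }

    // Returns true if the task ran at least `times` times within `timeoutMs`.
    static boolean runsAtLeast(int times, long intervalMs, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(times);
        PeriodicInference p = new PeriodicInference();
        p.start(latch::countDown, intervalMs);
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            p.stop();
        }
    }
}

class PeriodicInferenceDemo {
    public static void main(String[] args) {
        // A 100 ms interval gives roughly ten runs per second.
        System.out.println(PeriodicInference.runsAtLeast(10, 100, 5000));
    }
}
```

Lowering `intervalMs` makes the task run more often; whether each run sees fresh audio depends on how the task reads its input, which this sketch does not model.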
Oh I see how it is implemented.
Though compared with Swift/Kotlin, manipulating arrays in Java is such a pain. :laughing: To implement the sliding window, modify the `public void splice()` function from the code here, replacing the comments inside the `try` block with your own code. For example:
```java
public void splice() {
    if (record.getState() != AudioRecord.STATE_INITIALIZED) {
        Log.e(LOG_TAG, "Audio Record can't initialize!");
        return;
    }
    while (shouldContinue) {
        short[] shortData = new short[bufferSize];
        record.read(shortData, 0, shortData.length);
        recordingBufferLock.lock();
        try {
            // Add window sliding function here
            // When array fills up, call the function `recordingData.emit(<ADD ARRAY HERE>);` to emit the data for recognition
            // Clear out array once it’s been emitted, and repeat window sliding
            // If you wish to stop the recording, call the `stop()` function
        } finally {
            recordingBufferLock.unlock();
        }
    }
}
```
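As a rough starting point for the window sliding step, here is a minimal sketch in plain Java with no Android dependencies. `SlidingWindow`, `SlidingWindowDemo`, `push`, `windowSize`, and `hop` are hypothetical names for illustration, not part of the plugin. It assumes you want a fixed-length window that advances by a fixed number of samples, similar to the 16000-sample window and 1600-sample step in the Swift code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the plugin): keeps the most recent
// `windowSize` samples and produces a full window every `hop` new samples.
final class SlidingWindow {
    private final short[] window; // samples, oldest first
    private final int hop;        // samples dropped before each new window
    private int filled = 0;       // number of valid samples in `window`

    SlidingWindow(int windowSize, int hop) {
        if (windowSize <= 0 || hop <= 0 || hop > windowSize) {
            throw new IllegalArgumentException("need 0 < hop <= windowSize");
        }
        this.window = new short[windowSize];
        this.hop = hop;
    }

    // Appends the first `length` samples of `chunk` and returns a copy of
    // every full window completed along the way (possibly none).
    List<short[]> push(short[] chunk, int length) {
        List<short[]> ready = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            if (filled == window.length) {
                // Slide: drop the oldest `hop` samples by shifting left.
                System.arraycopy(window, hop, window, 0, window.length - hop);
                filled -= hop;
            }
            window[filled++] = chunk[i];
            if (filled == window.length) {
                ready.add(window.clone());
            }
        }
        return ready;
    }
}

class SlidingWindowDemo {
    public static void main(String[] args) {
        SlidingWindow w = new SlidingWindow(4, 2); // tiny sizes for illustration
        // Prints [1, 2, 3, 4] and then [3, 4, 5, 6]
        for (short[] full : w.push(new short[] {1, 2, 3, 4, 5, 6, 7}, 7)) {
            System.out.println(Arrays.toString(full));
        }
    }
}
```

With the sizes from the Swift code this would be `new SlidingWindow(16000, 1600)`, a one-second window at 16 kHz that advances by 100 ms. Inside the `try` block of `splice()`, each array returned by `push` could then be passed to `recordingData.emit(...)`, using the non-negative count returned by `record.read` as `length`. Treat this as an untested suggestion for the plugin itself.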
Let me know if this helps.
Ok yeah, I'll see if I can piece something together. If not, I may use the Kotlin code to put together a new Android plugin. Would be happy to share that if so.
No problems. If you do have a solution, the Flutter community and I would appreciate your contribution very much :)
Hey man, thanks for this awesome plugin. I was looking to increase the number of times per second I run my model. I was able to implement a sliding window in the Swift code, but I'm not super familiar with Java or Android development. I was wondering if you could provide some suggestions on how to accomplish this?
Thanks, Brett