dhrebeniuk / RosaKit

LibRosa port to Swift, making it possible to use the same preprocessing logic on iOS/macOS platforms
MIT License
83 stars 14 forks

Recording Audio Missing from MicroReader (2048*20) #5

Closed jeenezh closed 3 years ago

jeenezh commented 3 years ago

Hello, thanks for the great project. I deployed the soundRecognizer project on iOS and use an AVAudioEngine to write out the recorded files. Parts of the audio are missing from the recorded output. Someone said it is because "AVAssetReader seems to discard the first packet, leaving you the second packet, whose presentation timestamp is 1024, and you need only discard 2112 - 1024 = 1088 of the decoded frames."
I was wondering whether it is instead caused by the window length of the buffer. Why do you write the window length as 2048*20 rather than as the single integer 40960?
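
For reference, the arithmetic in the quoted explanation can be written out as a small sketch. The constant and function names below are hypothetical, and the numbers are taken from the quote as-is, not from RosaKit or from any measured recording.

```python
# Figures taken from the quoted explanation; all names here are hypothetical.
PRIMING_FRAMES = 2112          # leading frames that carry no real audio
FIRST_PACKET_TIMESTAMP = 1024  # frames already dropped with the first packet


def frames_still_to_trim(priming_frames: int, already_dropped: int) -> int:
    """Leading decoded frames that still need to be discarded."""
    return max(priming_frames - already_dropped, 0)


def trim_leading(decoded: list, priming_frames: int, already_dropped: int) -> list:
    """Return the decoded frames with the remaining priming frames removed."""
    return decoded[frames_still_to_trim(priming_frames, already_dropped):]


remaining = frames_still_to_trim(PRIMING_FRAMES, FIRST_PACKET_TIMESTAMP)
```

Under these assumptions `remaining` is 1088, matching the quote. This only affects the start of a decoded file, so it would not by itself explain gaps in the middle of a recording.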

A bit of background about my project, if you are interested: I am trying to deploy an ML model in the project to predict coughing sounds, and I also want to record the sounds so that I can run the recorded files in Python and compare. This is for my thesis and I have been struggling for a while, so I would highly appreciate a bit more information about how you set up the buffer. Merci!

dhrebeniuk commented 3 years ago

@jeenezh, at 44 kHz one CMSampleBuffer is 2080 bytes.

The model in my application requires a 128x81 matrix, which is transformed from 20 samples (a sample here is one audio data item of 2080 bytes).
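
A minimal Python sketch of how 20 buffers could add up to an 81-column matrix. It is an illustration under several assumptions, not the project's actual code: it uses 2048 samples per buffer (the figure from the issue title, whereas the reply above says 2080 bytes), a hop length of 512 with centered framing (librosa's defaults), and it reads the 128 rows as mel bands. `WindowAccumulator` and `centered_frame_count` are hypothetical names.

```python
CHUNK_SAMPLES = 2048      # assumption: samples per incoming buffer (issue title)
CHUNKS_PER_WINDOW = 20    # buffers collected before one model input is built
WINDOW_LENGTH = CHUNK_SAMPLES * CHUNKS_PER_WINDOW  # 40960 samples
HOP_LENGTH = 512          # assumption: librosa's default hop for n_fft = 2048
N_MELS = 128              # assumption: the 128 rows are mel bands


def centered_frame_count(n_samples: int, hop_length: int) -> int:
    """Number of STFT frames when the signal is padded so frames are centered."""
    return 1 + n_samples // hop_length


class WindowAccumulator:
    """Collects fixed-size chunks and emits one flat window when enough arrive."""

    def __init__(self, chunks_per_window: int):
        self.chunks_per_window = chunks_per_window
        self._chunks = []

    def push(self, chunk):
        """Store a chunk; return the full window once complete, else None."""
        self._chunks.append(list(chunk))
        if len(self._chunks) < self.chunks_per_window:
            return None
        window = [sample for c in self._chunks for sample in c]
        self._chunks = []  # start collecting the next window
        return window


matrix_shape = (N_MELS, centered_frame_count(WINDOW_LENGTH, HOP_LENGTH))
```

With these assumed values, `matrix_shape` comes out as (128, 81), and writing the window length as 2048*20 simply keeps the two factors (buffer size and buffer count) visible. If a recorder writes out only some of the buffers that feed such an accumulator, the saved file will have gaps even though the model input is complete.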