Closed by infernusdomoso 7 years ago
You can write whatever you send to the RunDetection function to a file; that is the raw audio. You can then use sox to turn it into a WAV file with a header, e.g., with the following command:
sox -r 16000 -c 1 -b 16 -e signed-integer -t raw audio.raw -t wav audio.wav
where audio.raw is what you write to the disk.
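If sox is not available, the same conversion can probably be done with Python's standard wave module. This is only a minimal sketch: raw_to_wav is a hypothetical helper name, and the defaults simply mirror the sox flags above (16000 Hz, 1 channel, 16-bit signed samples, assumed little-endian as WAV expects).

```python
import wave


def raw_to_wav(raw_path, wav_path, rate=16000, channels=1, sample_width=2):
    """Wrap headerless PCM audio in a WAV container.

    Hypothetical helper; the defaults mirror the sox flags
    -r 16000 -c 1 -b 16 (sample_width is in bytes, so 2 means 16-bit).
    """
    with open(raw_path, "rb") as src:
        pcm = src.read()

    with wave.open(wav_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(sample_width)
        dst.setframerate(rate)
        # The samples are copied unchanged; only a header is added.
        dst.writeframes(pcm)


if __name__ == "__main__":
    raw_to_wav("audio.raw", "audio.wav")
```

If the resulting file sounds wrong, the parameters most likely do not match what was actually written to audio.raw.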
Awesome, thanks. One more question - how do I extend the amount of data that is sent to RunDetection? I saved the data to disk and converted it with sox, but it's about half a second of my voice - I need to make the buffer larger. I've tried a few things and nothing seems to work.
Never mind, figured it out. I added a second ring buffer object that can grow to an unlimited size. I then set a recording flag so that it keeps the incoming data, sleep inside the callback function, and finally save the raw data to disk for processing. Thanks for the info on the sox command.
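For anyone reading later, the idea of a growing buffer guarded by a recording flag might look roughly like the sketch below. It is not infernusdomoso's actual code: GrowingRecorder and its method names are made up for illustration, and it assumes the audio callback hands over raw bytes.

```python
import threading


class GrowingRecorder:
    """Unbounded buffer that only keeps audio while a flag is set.

    Hypothetical class for illustration, not part of snowboy.
    """

    def __init__(self):
        self._chunks = []
        self._lock = threading.Lock()
        self.recording = False

    def feed(self, chunk):
        """Call from the audio callback with every raw chunk."""
        with self._lock:
            if self.recording:
                self._chunks.append(bytes(chunk))

    def start(self):
        """Drop old audio and begin keeping new chunks."""
        with self._lock:
            self._chunks = []
            self.recording = True

    def stop(self):
        """Stop keeping chunks and return everything recorded."""
        with self._lock:
            self.recording = False
            data = b"".join(self._chunks)
            self._chunks = []
        return data
```

In this sketch the hotword callback would call start(), wait for as long as the recording should last, then call stop() and write the returned bytes to disk. The lock is there because the audio callback usually runs on a different thread from the detection loop.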
Sorry to bother you, infernusdomoso. I assume you are using Python. Could you share a code snippet for saving the raw audio to a file? I tried several solutions, but all I get is a quiet bang. Hotword detection is working, so it's not a problem with the mic.
One of my tries was the following (I know it's stupid to open the file for writing on each loop). I'm editing snowboydecoder.py, and this is the loop snippet where I added saving to a file:
`
is_detected = False
while True:
    if interrupt_check():
        logger.debug("detect voice break")
        break
    data = self.ring_buffer.get()
    if len(data) == 0:
        time.sleep(sleep_time)
        continue
    ans = self.detector.RunDetection(data)
    if ans == -1:
        logger.warning("Error initializing streams or reading audio data")
    elif ans > 0:
        message = "Keyword " + str(ans) + " detected at time: "
        message += time.strftime("%Y-%m-%d %H:%M:%S",
                                 time.localtime(time.time()))
        logger.info(message)
        with open('recorded_sound.raw', 'ab') as f:
            f.write(data)
        is_detected = True
        callback = detected_callback[ans-1]
        if callback is not None:
            callback()
    elif (ans == -2 and is_detected):
        # save file if silence is detected
        is_detected = False
        message = "Silence detected. Saving to recorded_sound.raw file."
        logger.info(message)
        with open('recorded_sound.raw', 'ab') as f:
            f.write(data)
            f.close()
`
tnx
Here is a code snippet that implements what infernusdomoso suggested:
Add the second ring buffer in the init method:
self.ring_buffer_hot = RingBuffer(4096*8)
# or whatever size you want
Then in the audio_callback() add
self.ring_buffer_hot.extend(in_data)
Finally, in the start method, add the following lines inside the if callback is not None condition:
data1 = self.ring_buffer_hot.get()
with open('recorded_sound.raw', 'ab') as f:
    f.write(data1)
    f.close()
Then try a hotword, and a file called recorded_sound.raw should be created. Convert it using the sox command provided by chenguoguo:
sox -r 16000 -c 1 -b 16 -e signed-integer -t raw recorded_sound.raw -t wav audio.wav
aplay audio.wav
Hope it helps
PS: Do you know of any Python command to convert the audio file to a 16-bit, 1-channel, 16 kHz wave file?
EDIT: All right, instead of:
with open('recorded_sound.raw', 'ab') as f:
    f.write(data1)
    f.close()
use:
# needs "import wave" at the top of the file
try:
    f = wave.open('recorded_sound.raw', 'wb')
    # 1 channel, 2 bytes per sample (16-bit), 16000 Hz
    f.setparams((1,2,16000,0,'NONE','NONE'))
    f.writeframes(data1)
    f.close()
except IOError as e:
    print(e)
This works flawlessly.
@GiovanniBalestrieri Great! Would you like to submit a PR to https://github.com/Kitt-AI/snowboy/tree/master/examples/Python?
I'm pretty new to audio data. I'm trying to either have the software break and release the microphone when the wake word is heard, so the audio can be recorded and saved, or, maybe easier, start saving the stream directly. I've tried using the wave library to save the data as a WAV file, but it's all garbled noise, so I'm thinking the 'data' object that holds the stream audio is not in the right format?
Can someone point me in the right direction for saving the stream audio as a WAV file? It always seems to make an 8k file that's just garbled noise. I'm clearly missing something.
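One way to narrow this down might be to read back the header of the file that was written and compare it with the format used earlier in this thread (1 channel, 2 bytes per sample, 16000 Hz). The sketch below uses only the standard wave module; describe_wav is a hypothetical helper name, and the file name in the usage line is just the one from the comments above.

```python
import wave


def describe_wav(path):
    """Return the header fields of a WAV file as a dict.

    Hypothetical diagnostic helper for checking what was written.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": rate,
            "frames": frames,
            "seconds": frames / rate,
        }


if __name__ == "__main__":
    print(describe_wav("audio.wav"))
```

At 16000 Hz with 2 bytes per sample and one channel, a second of audio is 32000 bytes, so an 8 KB file holds only about a quarter of a second. That would suggest only one or two chunks are being written, in which case collecting chunks over a longer period before saving, as described earlier in the thread, may be the missing piece.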