rohandeo opened this issue 4 years ago
You can turn on wakeword saving in the mycroft config if you just want to keep the recorded wake word audio. It's incredibly useful for training your model.
@el-tocino I'm using a source install as instructed on the mycroft-precise Github page. Could you tell me where I can find the mycroft config file?
If you installed mycroft locally, there's one in ~/.mycroft/mycroft.conf, or you can edit one in /etc/mycroft/mycroft.conf
I do not have a ~/.mycroft or /etc/mycroft directory. I'm just testing out my trained model for now. Could you elaborate more on your solution?
@el-tocino I have not installed mycroft-core and would like to know if there is a way to solve my issue without installing it. I am only interested in recognizing a word from a live stream, and I have other computations to run on that word once it is recognized.
Sorry, I only use it with mycroft.
@rohandeo Just to ensure the recording is getting saved properly, try using the mechanism precise has for saving wake words. There are two ways:
precise-listen -s somefolder mymodel.net
This saves activations to somefolder.
precise-collect
This saves audio to wav files in the same way precise gets data from the microphone.
Let me know if you can reproduce the results using either of those methods.
@MatthewScholefield I don't think precise-listen has an option to save activations. This is the list of options that precise-listen supports which I found in the precise-listen script.
:model str
Either Keras (.net) or TensorFlow (.pb) model to run
:-c --chunk-size int 2048
Samples between inferences
:-l --trigger-level int 3
Number of activated chunks to cause an activation
:-s --sensitivity float 0.5
Network output required to be considered activated
:-b --basic-mode
Report using . or ! rather than a visual representation
:-d --save-dir str -
Folder to save false positives
:-p --save-prefix str -
Prefix for saved filenames
Is there some other way to save activated audio?
precise-collect output: precise-collect-output.zip (transcript: "bed on karaoke similarity reality")
Not much difference. The data is still lossy. Is the PyAudio module the reason behind the lossy data?
Sorry, I used the wrong flag, it's the -d (--save-dir) flag, as you can see in the info string.
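For example, mirroring the earlier invocation (the folder and model names are the same placeholders as above):
precise-listen -d somefolder mymodel.net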
Isn't that just for false positives though? @MatthewScholefield
It saves any activation, since it doesn't know which one is a proper activation and which one is a false positive. I suppose this should be corrected in the docstring.
@MatthewScholefield The precise-listen option worked like a charm. I think the reason is that the precise-listen script converts the raw audio bytes back into float32 (greater precision -> greater clarity). Thanks a lot for your help. P.S. Sorry for the late reply. Finals week.
No problem, I would expect that there might have been some other subtle bug causing the issue with those audio files since they did sound awfully strange, but I'm glad it's resolved. And no worries, I've also been busy with finals.
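For reference, the int16-to-float32 conversion mentioned above typically looks like the following. This is a minimal numpy sketch assuming 16-bit PCM input; it is not the exact code used inside precise.

import numpy as np

def chunk_to_float32(chunk: bytes) -> np.ndarray:
    # interpret the raw bytes as 16-bit signed PCM samples
    samples = np.frombuffer(chunk, dtype=np.int16)
    # scale to the range [-1.0, 1.0) as float32
    return samples.astype(np.float32) / 32768.0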
Hi @rohandeo, I have a similar application. I have a custom wakeword and it works well. I want to record the audio after the wakeword and save it to a file. I am very new to using Mycroft. Can you help me with the code? This is what I have tried in order to record audio after the wakeword is detected. The output I get is just full of noise.
# requires: import wave; self.pa is a pyaudio.PyAudio() instance, self.stream its input stream
def _handle_predictions(self):
    # continuously read chunks from the mic and pass them to the engine
    while self.running:
        chunk = self.stream.read(self.chunk_size)
        if self.is_paused:
            continue
        prob = self.engine.get_prediction(chunk)
        self.on_prediction(prob)
        if self.detector.update(prob):
            # self.on_activation()
            # record roughly 10 seconds of audio after the activation
            print("Activated")
            # chunk = self.stream.read(self.chunk_size)
            frames = []
            for i in range(0, int(16000 / self.chunk_size * 10)):
                chunk = self.stream.read(self.chunk_size)
                frames.append(chunk)
            # write the captured frames as a 6-channel, 16 kHz, 16-bit WAV
            wf = wave.open("test.wav", 'wb')
            wf.setnchannels(6)
            wf.setsampwidth(self.pa.get_sample_size(self.pa.get_format_from_width(2)))
            wf.setframerate(16000)
            wf.writeframes(b''.join(frames))
            wf.close()
            print("* done recording")
@Dipendra77 Your code works just fine for me, except that I use 1 channel and sample width 2.
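For anyone following along, a minimal sketch of the WAV-writing step with those settings (1 channel, 2-byte samples, 16 kHz); the function name and frames argument are placeholders matching the snippet above:

import wave

def save_frames(frames, path="test.wav", rate=16000):
    # write mono, 16-bit PCM at 16 kHz, matching the microphone stream settings above
    wf = wave.open(path, 'wb')
    wf.setnchannels(1)    # mono microphone stream
    wf.setsampwidth(2)    # 16-bit samples
    wf.setframerate(rate)
    wf.writeframes(b''.join(frames))
    wf.close()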
Goal
I am trying to capture the audio segment containing the wakeword from my raw microphone input stream.
Background
I have trained my own model on a private dataset and everything works fine; test accuracy is over 99%. When I use precise-listen on live input streams it gives decent results, but I have to be very close to the mic when I speak. If I'm farther than about 1.5 meters away, speaking at a normal conversational volume and pitch, the model does not activate at all.
Initial steps
I have written a small script which creates a GUI, starts the runner when I press a "Start" button, and stops it when I press "Stop". I then modified mycroft-precise/runner/precise_runner/runner.py to write out the bytes or "chunks" that were getting activated. To first check whether I could get any audio out at all, I initialized a bytes string called "self.record", appended every chunk to it (regardless of whether it was activated or not), and wrote it out to a file. I am pasting only the modified functions from class PreciseRunner so as not to clutter the issue. The changed lines have been marked.
Code for recording all bytes from mic input
Code for writing the recorded bytes
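(The attached code is not reproduced in this transcript. Below is a minimal sketch of the kind of modification described above, appending every chunk to a self.record buffer in _handle_predictions and dumping it to a file afterwards; apart from the original runner attributes, the names and file path are hypothetical.)

# sketch of the described change inside class PreciseRunner (hypothetical reconstruction)
def _handle_predictions(self):
    self.record = b''                      # buffer holding every chunk read from the mic
    while self.running:
        chunk = self.stream.read(self.chunk_size)
        if self.is_paused:
            continue
        self.record += chunk               # keep all audio, activated or not
        prob = self.engine.get_prediction(chunk)
        self.on_prediction(prob)
        if self.detector.update(prob):
            self.on_activation()

# sketch of writing the recorded bytes when the runner stops (hypothetical helper)
def _write_recording(self, path='live_stream_capture.raw'):
    with open(path, 'wb') as f:
        f.write(self.record)               # raw PCM bytes, converted to WAV separately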
Data
The live data is very lossy. I'm attaching a sample file which I recorded using this script (converted from bytes to WAV for your convenience): live_stream_capture.zip (transcript: "bed on karaoke similarity reality"). I'm using Ubuntu 18.04 (4 GB RAM, 4 CPUs) inside VirtualBox on a Windows 10 machine.
Issues