Open m15kh opened 1 week ago
That audio is what you're feeding into the model -- you'll want to tee it before the model, not try to get audio back out alongside the predictions.
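The "tee it before" idea can be sketched with the standard library alone: duplicate the incoming chunk stream, feed one copy to the model, and write the other copy to a WAV file. This is an illustrative sketch, not diart API -- the function names and the int16-mono-chunk format are assumptions.

```python
import io
import itertools
import struct
import wave

def tee_chunks(chunks):
    """Duplicate a chunk stream: one iterator for the model, one for disk."""
    return itertools.tee(iter(chunks), 2)

def write_wav(file, chunks, sample_rate=16000):
    """Persist 16-bit mono PCM chunks (sequences of ints) to a WAV container."""
    with wave.open(file, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 16-bit samples
        wf.setframerate(sample_rate)
        for chunk in chunks:
            wf.writeframes(struct.pack(f"<{len(chunk)}h", *chunk))
```

One copy of the stream goes to the recognizer as before; the other is drained by `write_wav`, so saving audio never depends on what the model emits.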
@yeahphil Please see the screenshot: my custom model detects events as I intended. However, when it makes a prediction, I would like it to save the exact audio corresponding to the predicted label (highlighted in blue) in WAV or MP3 format.
How can I achieve this?
Is this a duplicate of #246? I just posted an answer there that should help you get the chunks of audio alongside the predictions. Let me know how it works.
@juanmc2005
Yes, it's a duplicate because I thought I hadn't explained myself well in issue #246.
First of all, thanks for your answer.
I did this, and now I can save the audio segment the model predicted:
```python
import soundfile as sf

# Imports below follow the diart examples; adjust to your installed version
from diart import models as m
from diart.blocks import VoiceActivityDetection, VoiceActivityDetectionConfig
from diart.inference import StreamingInference
from diart.sources import MicrophoneAudioSource

count = 0

def use_real_time_prediction(results):
    """Hook: save the audio chunk whenever the model predicts activity."""
    global count
    prediction, waveform = results
    if prediction:
        # NOTE: save the chunk's audio
        filename = f"output/waveform{count}.wav"
        sf.write(filename, waveform.data, samplerate=16000)
        print(f"Waveform saved to {filename}")
        count += 1

path_model = "checkpoint.ckpt"
config = VoiceActivityDetectionConfig(
    segmentation=m.SegmentationModel.from_pyannote(path_model)
)
pipeline = VoiceActivityDetection(config=config)
mic = MicrophoneAudioSource()
inference = StreamingInference(pipeline, mic, do_plot=False, do_profile=True)
inference.attach_hooks(use_real_time_prediction)
total_prediction = inference()
```
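One caveat with the snippet above: it writes the entire chunk buffer whenever the prediction is non-empty, not just the audio under the predicted (blue) label. To keep only the labeled region, you can map each predicted segment's absolute times onto sample indices inside the chunk. A minimal sketch of that arithmetic, assuming a 16 kHz sample rate:

```python
def segment_to_sample_range(seg_start, seg_end, chunk_start, num_samples,
                            sample_rate=16000):
    """Map a predicted segment (absolute times, in seconds) onto sample
    indices inside the chunk buffer, clamped to the chunk's bounds."""
    begin = max(int(round((seg_start - chunk_start) * sample_rate)), 0)
    end = min(int(round((seg_end - chunk_start) * sample_rate)), num_samples)
    return begin, end
```

Inside the hook you would iterate over the prediction's segments (in pyannote.core terms, `prediction.get_timeline()`, with the chunk's start time from `waveform.extent.start`) and pass the slice `waveform.data[begin:end]` to `sf.write` -- double-check those attribute names against your diart/pyannote versions.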
Hi,
Regarding this issue: link, it returns annotations in real time.
How can I achieve the same for the audio? I want to save the exact waveform as a WAV file in real time.