Returning real time waveform

m15kh commented 2 weeks ago

Hi,

Regarding this issue: link, it returns annotations in real time.

How can I achieve the same for sound? I want to save the exact waveform in WAV format in real time.

yeahphil commented 2 weeks ago

That audio is what you're feeding into the model, so you'll want to tee it beforehand rather than try to get audio back out alongside the predictions.
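
For illustration, here is a minimal sketch of what "teeing" the audio could look like, independent of diart. It assumes the audio arrives as an iterable of 16-bit mono chunks (`array('h')`). The helper name `tee_to_wav` and the chunk format are hypothetical, not part of any library. Each chunk is written to a WAV file and then passed on unchanged to whatever consumes the stream.

```python
import wave
from array import array
from typing import Iterable, Iterator


def tee_to_wav(
    chunks: Iterable[array], path: str, sample_rate: int = 16000
) -> Iterator[array]:
    """Yield every chunk unchanged while also appending it to a WAV file.

    Hypothetical helper: assumes 16-bit mono PCM chunks stored in array('h').
    """
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            wav.writeframes(chunk.tobytes())  # copy goes to the recording
            yield chunk                       # original goes on to the model
```

The file is finalized once the generator is exhausted or closed, so the consumer should iterate to the end of the stream (or call `close()` on the generator) before reading the WAV file.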

m15kh commented 2 weeks ago

@yeahphil Please see the image below. In it, my custom model detects an event as I intended. However, when the model makes a prediction, I would like it to save the exact audio corresponding to the predicted label (highlighted in blue) in WAV or MP3 format.

How can I achieve this?

juanmc2005 commented 2 weeks ago

Is this a duplicate of #246 ? I just posted an answer there that should help you get the bits of audio alongside the predictions. Let me know how it works.

m15kh commented 1 week ago

@juanmc2005
Yes, it's a duplicate question; I thought I hadn't explained it well in issue #246.

First of all, thanks for your answer.

I did this, and now I can save the audio segment that the model predicted:

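import os

# Import paths below assume a recent diart release; adjust if your version differs.
import soundfile as sf
import diart.models as m
from diart import VoiceActivityDetection, VoiceActivityDetectionConfig
from diart.inference import StreamingInference
from diart.sources import MicrophoneAudioSource

SAMPLE_RATE = 16000
os.makedirs('output', exist_ok=True)  # sf.write does not create missing directories

count = 0

def use_real_time_prediction(results):
    """Hook: save the audio chunk whenever the model predicts something."""
    global count
    prediction, waveform = results

    if prediction:
        # NOTE: save the audio for this prediction
        filename = f'output/waveform{count}.wav'
        sf.write(filename, waveform.data, samplerate=SAMPLE_RATE)
        print(f"Waveform saved to {filename}")
        count += 1

path_model = 'checkpoint.ckpt'
config = VoiceActivityDetectionConfig(segmentation=m.SegmentationModel.from_pyannote(path_model))
pipeline = VoiceActivityDetection(config=config)
mic = MicrophoneAudioSource()
inference = StreamingInference(pipeline, mic, do_plot=False, do_profile=True)

inference.attach_hooks(use_real_time_prediction)

total_prediction = inference()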
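
The hook above saves the whole chunk whenever a prediction is present. To keep only the audio inside each predicted segment, the chunk could be cropped first. The following is a rough sketch under stated assumptions: `crop_segments` is a hypothetical helper, segment times are absolute seconds, and `chunk_start` is the time at which the chunk begins. In diart these values would presumably come from the prediction's segments and the waveform's sliding window, but that mapping is an assumption and is not shown here.

```python
from typing import List, Sequence, Tuple


def crop_segments(
    samples: Sequence[float],
    chunk_start: float,
    sample_rate: int,
    segments: Sequence[Tuple[float, float]],
) -> List[Sequence[float]]:
    """Return one slice of `samples` per (start, end) segment.

    Hypothetical helper: times are in seconds, and `samples` is assumed to
    begin at `chunk_start`. Segments outside the chunk are skipped.
    """
    crops = []
    for seg_start, seg_end in segments:
        # Convert absolute times to sample indices relative to the chunk,
        # clamping to the chunk boundaries.
        first = max(0, round((seg_start - chunk_start) * sample_rate))
        last = min(len(samples), round((seg_end - chunk_start) * sample_rate))
        if last > first:
            crops.append(samples[first:last])
    return crops
```

Each returned slice could then be written to its own file in place of the full chunk.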

m15kh commented 1 week ago

@juanmc2005 I also added this feature and opened a pull request: link

juanmc2005 / diart

Returning real time waveform #250