snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector
MIT License

Feature request - Add a callback function for tracking the progress of the `get_speech_timestamps` function. #281

Closed: saenyakorn closed this issue 1 year ago

saenyakorn commented 1 year ago

🚀 Feature

Add a callback parameter to get_speech_timestamps for tracking its progress. The callback's first argument is the progress as a percentage, calculated as current sample index / total number of samples * 100.

Motivation

Hi Silero VAD team. I have recently been using your model to chunk lecture audio. The model works well and is accurate, but the get_speech_timestamps util is missing something. Most of the lecture recordings are around 1.5 to 3 hours long, so get_speech_timestamps takes about 8 to 12 minutes to return the timestamps. During that time I cannot tell how far it has progressed, because there is no logging or other feedback while the timestamps are being computed.

Pitch

I think it would be great if I could pass a callback function to get_speech_timestamps, like this, to log the progress.

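import logging

logger = logging.getLogger(__name__)

def tracking_progress(progress: float) -> None:
    # progress is the percentage described above (0 to 100)
    # other logic, like storing the progress in the database, can go here
    logger.debug(progress)

# `callback` is the proposed new parameter
speech_timestamps = get_speech_timestamps(
    audio=wav_file,
    model=model,
    sampling_rate=sampling_rate,
    callback=tracking_progress,
)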

I think adding a logger parameter to get_speech_timestamps instead would be less convenient: I also want to store the progress in my database, which I cannot do with a logger. A callback function would be more flexible for everyone.
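To make the idea concrete, here is a minimal sketch of how such a callback might be driven from a loop over fixed-size windows. This is not the actual get_speech_timestamps code: the function name process_windows, its parameters, and the window size are assumptions invented for illustration, and the model call is replaced by a placeholder.

```python
from typing import Callable, List, Optional


def process_windows(
    num_samples: int,
    window_size: int,
    callback: Optional[Callable[[float], None]] = None,
) -> List[int]:
    """Walk over an audio buffer window by window and report percent progress.

    Hypothetical stand-in for the model loop; it only records window starts.
    """
    if num_samples <= 0 or window_size <= 0:
        raise ValueError("num_samples and window_size must be positive")

    window_starts: List[int] = []
    for start in range(0, num_samples, window_size):
        # Placeholder for running the model on samples [start, start + window_size).
        window_starts.append(start)

        if callback is not None:
            # The last window may be shorter, so clamp to the total length.
            processed = min(start + window_size, num_samples)
            callback(processed / num_samples * 100)
    return window_starts


# Usage: any callable works, e.g. appending to a list instead of a database.
progress_log: List[float] = []
process_windows(num_samples=1000, window_size=256, callback=progress_log.append)
print(progress_log)
```

Because the callback is just a callable that takes a float, the same hook could write to a logger, a database, or a progress bar without the function knowing which.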

Thank you for your consideration.

snakers4 commented 1 year ago

Hi,

This makes sense, but in most cases we just use off-the-shelf modules like tqdm to visualize progress. I would not like to add tqdm as a dependency, but adding a callback makes sense, so that the user could just pass a function. The get_speech_timestamps function has 3 for loops inside, but most likely 95% of the time is spent applying the actual model.

In any case - a PR for such a feature would be appreciated.
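As a rough illustration of the point about not depending on tqdm: if the callback receives a percentage, the caller can adapt it to any progress bar on their own side. The sketch below is hypothetical (PercentToBar and RecordingBar are invented names, not part of silero-vad). It converts absolute percentages into increments for any object that has an update(n) method.

```python
from typing import Any, List


class RecordingBar:
    """Stand-in progress bar that only records the increments it receives."""

    def __init__(self) -> None:
        self.calls: List[float] = []

    def update(self, n: float) -> None:
        self.calls.append(n)


class PercentToBar:
    """Adapt a percent-based callback (0 to 100) to an incremental update(n) API."""

    def __init__(self, bar: Any) -> None:
        self._bar = bar
        self._last = 0.0

    def __call__(self, progress: float) -> None:
        # Clamp to [last, 100] so the bar never moves backwards or overshoots.
        progress = max(self._last, min(progress, 100.0))
        delta = progress - self._last
        if delta > 0:
            self._bar.update(delta)
            self._last = progress


# Usage with the stand-in bar; the adapter itself is the callback.
bar = RecordingBar()
on_progress = PercentToBar(bar)
for percent in (25.0, 50.0, 50.0, 100.0):
    on_progress(percent)
print(bar.calls)
```

With tqdm installed, passing `tqdm(total=100)` in place of RecordingBar should work the same way, since `tqdm.update` accepts a numeric increment. The library itself stays free of the dependency.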

saenyakorn commented 1 year ago

Sure. I'll make a PR for this feature soon. Btw, how can I contribute to this repo? I don't see the requirements.txt or other instructions document.

snakers4 commented 1 year ago

> I don't see the requirements.txt or other instructions document.

If I understand your question correctly, the PyTorch VAD and utils require only PyTorch and the Python standard library. Since the VAD is pulled using torch.hub, it is safe to assume that you already have PyTorch.

snakers4 commented 1 year ago

If necessary, let's continue the discussion in the PR: https://github.com/snakers4/silero-vad/pull/282