Closed rvryan67 closed 4 months ago
Hi, which issue are you having with v5? Can you post some reproducible code which causes an error?
import logging
import os
import urllib.request
import uuid

import ffmpeg
import torch
import torchaudio

logger = logging.getLogger(__name__)

vad, utils = torch.hub.load(repo_or_dir="snakers4/silero-vad", model="silero_vad", onnx=False)


def speechonly(wavfile, utils, vad):
    (get_speech_timestamps, save_audio, read_audio, VADIterator, collect_chunks) = utils
    VAD_SR = 16000
    vad_threshold = 0.4
    tmpAudioFile = "/tmp/" + str(uuid.uuid4()) + ".wav"  # create wav file from audio_string
    wav = read_audio(wavfile, sampling_rate=VAD_SR)
    # Returns list with segments of audio timestamps (start and end)
    t = get_speech_timestamps(wav, vad, sampling_rate=VAD_SR, threshold=vad_threshold, min_speech_duration_ms=250)
    print(t)
    chunks = []
    chunk_probs = []
    for i in range(len(t)):
        t[i]["start"] = max(0, t[i]["start"] - 3200)  # 0.2s head
        t[i]["end"] = min(wav.shape[0] - 16, t[i]["end"] + 20800)  # 1.3s tail
        if i > 0 and t[i]["start"] < t[i - 1]["end"]:
            t[i]["start"] = t[i - 1]["end"]  # Remove overlap
        chunk_duration = t[i]["end"] - t[i]["start"]
        if chunk_duration >= 512:  # 512 is minimum size to pass through model at 16000 Hz sample rate
            # The whole segment is passed to the model in one call; this is the call that fails under v5
            speech_probability = vad(wav[t[i]["start"]:t[i]["end"]], VAD_SR).item()
            chunk = wav[t[i]["start"]:t[i]["end"]]
            chunks.append(chunk)
            chunk_probs.append(speech_probability)
    max_prob = max(chunk_probs, default=0.0)  # default avoids ValueError when no chunks were kept
    logger.info("speechonly len(chunks): " + str(len(chunks)) + ", max(chunk_probs): " + str(max_prob) + ", vad_threshold: " + str(vad_threshold))
    if len(chunks) == 0 or max_prob < vad_threshold:  # No speech segments detected or maximum segment probability is below threshold
        return wavfile, t
    else:
        combined_chunks = torch.cat(chunks)  # Combine audio segments into one tensor
        save_audio(tmpAudioFile, combined_chunks, sampling_rate=VAD_SR)  # Save combined tensor to audio file with non-speech removed
        return tmpAudioFile, t


def urlToWav(inputUrl, outputfile):
    # Defined before the try block so the finally clause can always reference it
    downloadfile = '/tmp/' + os.path.basename(inputUrl)
    try:
        if os.path.isfile(outputfile):
            os.remove(outputfile)
        urllib.request.urlretrieve(inputUrl, downloadfile)
        (
            ffmpeg.input(downloadfile)
            .output(outputfile, acodec='pcm_s16le', ac=1, ar=16000)
            .run(capture_stdout=True, capture_stderr=True)
        )
    except Exception as e:
        print("failed to convert to WAV - ERROR: " + str(e))
        return ""
    finally:
        if os.path.exists(downloadfile):
            os.remove(downloadfile)
    return outputfile


audioUrl = 'https://file-examples.com/storage/fe0ebbce85667e496a17872/2017/11/file_example_MP3_2MG.mp3'
tmpAudioFile = "/tmp/" + str(uuid.uuid4()) + ".wav"  # create wav file from s3 bucket
urlToWav(audioUrl, tmpAudioFile)
speechOnlyFile = tmpAudioFile
speechOnlyFile, voicetimestamps = speechonly(tmpAudioFile, utils, vad)
ERROR: Provided number of samples is 27936 (Supported values: 256 for 8000 sample rate, 512 for 16000)
Hi, this is correct behavior: the VAD has always had limitations on chunk size, and the chunk size is now fixed, as noted in the error message.
Also, a more proper way to get at the probabilities would probably be to extend the get_speech_timestamps function.
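For readers hitting the same error, one way to respect the fixed chunk size is to score a segment window by window instead of in a single call. The sketch below is illustrative only: `window_probabilities` and `predict` are hypothetical names, not part of silero-vad, and it assumes that trailing samples which do not fill a whole window can simply be dropped.

```python
def window_probabilities(samples, predict, window_size=512):
    """Score a 1-D sample sequence in consecutive fixed-size windows.

    `predict` is any callable mapping one window to a number. Trailing
    samples that do not fill a whole window are ignored (an assumption
    made for this sketch, not silero-vad behavior).
    """
    probs = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        probs.append(predict(samples[start:start + window_size]))
    return probs


# Stand-in scorer so the example runs without a model: 1100 samples
# contain two complete 512-sample windows.
demo = window_probabilities(list(range(1100)), lambda w: float(len(w)))
print(len(demo))
```

With the model from the posted code, `predict` could be something like `lambda w: vad(w, 16000).item()`, which mirrors the original call but feeds it 512-sample windows; the caller can then take the maximum or mean over the returned list.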
The code worked until recently; it has been broken since v5 was released yesterday.
Is there a way I can load the previous version to quickly fix the problem until I have time to fix it properly?
Your code ran, but it produced incorrect results since vad never worked with such large chunks.
In your case v4.0 does not load because it looks like PyTorch caches the hubconf file or the full repo.
From a fresh environment any version loads.
We removed old unused utils in 5.0, so after removing cache everything should work.
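A minimal sketch of "removing the cache" follows, assuming the cached checkouts live in folders named `snakers4_silero-vad_<ref>` as the paths in this thread suggest. The helper name `clear_hub_checkouts` is hypothetical, and the function deletes directories, so point it at the right location.

```python
import os
import shutil


def clear_hub_checkouts(hub_dir, prefix="snakers4_silero-vad_"):
    """Delete cached checkouts under hub_dir whose folder name starts with prefix.

    Returns the names of the folders that were removed.
    """
    removed = []
    if not os.path.isdir(hub_dir):
        return removed
    for name in sorted(os.listdir(hub_dir)):
        path = os.path.join(hub_dir, name)
        if name.startswith(prefix) and os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(name)
    return removed
```

In practice the directory could be obtained with `torch.hub.get_dir()`, for example `clear_hub_checkouts(torch.hub.get_dir())`, followed by a fresh `torch.hub.load(...)`.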
This issue (The provided filename /root/.cache/torch/hub/snakers4_silero-vad_master/files/silero_vad.jit does not exist) is likely caused by this line:
https://github.com/snakers4/silero-vad/blob/v4.0/hubconf.py#L38
According to the line above, the model is loaded from snakers4_silero-vad_master. However, running torch.hub.load(repo_or_dir="snakers4/silero-vad:v4.0", model="silero_vad", onnx=False), i.e. specifying a version number, puts the model and code in /root/.cache/torch/hub/snakers4_silero-vad_v4.0 instead.
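The mismatch can be illustrated with a small sketch that approximates how the cache folder name is derived from `repo_or_dir`. This is inferred from the two paths quoted above and is not a copy of the torch.hub source; `hub_checkout_dirname` is a hypothetical helper, and the `master` default is an assumption taken from the path in the error message.

```python
def hub_checkout_dirname(repo_or_dir, default_ref="master"):
    """Approximate the cache folder name for an 'owner/repo[:ref]' string."""
    repo, _, ref = repo_or_dir.partition(":")
    owner, name = repo.split("/")
    # Slashes in a ref are assumed to be flattened to underscores.
    ref = (ref or default_ref).replace("/", "_")
    return "_".join([owner, name, ref])


print(hub_checkout_dirname("snakers4/silero-vad"))       # snakers4_silero-vad_master
print(hub_checkout_dirname("snakers4/silero-vad:v4.0"))  # snakers4_silero-vad_v4.0
```

A model path hard-coded to the first folder name therefore cannot be found when only the second folder exists on disk.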
I see that in the latest release (v5.0), snakers4_silero-vad_master isn't hard-coded in the model loading step.
https://github.com/snakers4/silero-vad/blob/v5.0/hubconf.py#L43
@snakers4 could the JIT model loading code in v4.0 be updated to match what is in v5.0? Otherwise I think trying to download v4.0 will continue to have this issue.
i.e. update hubconf.py for v4.0 to this:
instead of this:
Many thanks, we arrived at the same conclusion. Hence the issue with "non-clean" initialization, when the init is "tainted" with loading several versions at once.
We are now thinking about how to fix the git history properly; there are 3 versions now (v5.0, v4.0 and v3.1) that people remember.
Ideally, of course, we would deprecate the old ones, but being able to load an earlier model easily in a non-clean environment is a nice feature, e.g. for benchmarking.
@snakers4 , if possible please don't deprecate the old versions yet. I use version 3.1 for transcribing long form anime movies and so far it works best. Thanks.
A possible solution would be to create historic branches for v4 and v3.1 and try to re-point the tags at these branches' commits.
If this works, it will be an easy fix.
Fixed the v3.1 and v4.0 tags; they should work properly now.
The tags for v3.1 and v4 are updated to load predictably for older versions on non-clean installations.
The only downside is that this solution may not work properly on Windows.
If so, a PR would be appreciated for this line: https://github.com/snakers4/silero-vad/blob/master/hubconf.py#L39
Please can someone verify that this now works.
Hello
I just tried, I got
ImportError: cannot import name 'get_number_ts' from 'utils_vad' (C:\Users\gaeta/.cache\torch\hub\snakers4_silero-vad_master\utils_vad.py)
When using model, _ = torch.hub.load( repo_or_dir='snakers4/silero-vad:v4.0', model='silero_vad', force_reload=True )
I'm on Windows
Hello
I just tried, I got
ImportError: cannot import name 'get_number_ts' from 'utils_vad' (/root/.cache/torch/hub/snakers4_silero-vad_master/utils_vad.py)
When using
model, _ = torch.hub.load( repo_or_dir='snakers4/silero-vad:v4.0', model='silero_vad', force_reload=True )
Hi. Try running this code before loading the VAD model to avoid the module collision:
import sys
try:
    # Drop the cached utils_vad module so the next load imports its own copy
    sys.modules.pop('utils_vad')
except KeyError:
    pass
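The same workaround can be wrapped in a small reusable helper, which is convenient when several versions are loaded in one session. This is only a sketch: `forget_modules` is a hypothetical name, and `utils_vad` is the only module the thread identifies as colliding.

```python
import sys


def forget_modules(*names):
    """Remove the named modules from Python's import cache.

    Returns the names that were actually cached, so the caller can see
    whether a stale module was present.
    """
    removed = []
    for name in names:
        if sys.modules.pop(name, None) is not None:
            removed.append(name)
    return removed
```

Calling `forget_modules("utils_vad")` before each `torch.hub.load(...)` of a different tag should have the same effect as the snippet above.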
It worked! I tried to load both v4 and v5 in the same Jupyter notebook (I wanted to benchmark the two versions against each other), which is why I got this conflict.
@dgoryeo @rvryan67 @ggoedde @helloWorld199 @hungiito
please verify that these fixes work for you
I can report that the fixes work. I tried V4 in colab environment and V3.1 in Windows environment (after cleaning the local cache). Thanks @snakers4 !
Confirmed that v4.0 works in Databricks environment. Thanks!
Looks like we have 3 confirmations. If the issue persists for someone, please open a new ticket.
❓ Questions and Help
I'm having issues with the latest version, v5.0.
Until I get time to investigate and fix the issue, I want to use the previous version:
vad, utils = torch.hub.load( repo_or_dir="snakers4/silero-vad:v4.0", model="silero_vad", onnx=False )
This results in the following error:
The provided filename /root/.cache/torch/hub/snakers4_silero-vad_master/files/silero_vad.jit does not exist