linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

GNU Affero General Public License v3.0

Use silero v3.1 #142

Closed Jeronymous closed 9 months ago

Jeronymous commented 10 months ago

Using silero-vad v3.1 instead of v4.0 seems to improve VAD. See the problems spotted in https://github.com/linto-ai/whisper-timestamped/issues/74

[Figure: left, VAD with silero-vad v4.0 (latest); right, VAD with silero-vad v3.1]

Notes:

versions of silero: https://github.com/snakers4/silero-vad/wiki/Version-history-and-Available-Models

Jeronymous commented 9 months ago

In the end I implemented a choice of VAD method. The default remains the same (latest silero, v4.0), but former versions of silero can be specified (e.g. "silero:3.1"), and "auditok" can also be used. See this PR that I'm gonna close: https://github.com/linto-ai/whisper-timestamped/pull/78
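
For readers wondering how a spec string like "silero:3.1" might be interpreted, here is a minimal sketch in Python. It is not the code from the PR: `parse_vad_spec` and `KNOWN_BACKENDS` are hypothetical names made up for illustration, and the rule that only silero accepts a version suffix is an assumption based on the examples above.

```python
from typing import Optional, Tuple, Union

# Hypothetical list of backends, taken from the names mentioned in this thread.
KNOWN_BACKENDS = ("silero", "auditok")


def parse_vad_spec(spec: Union[bool, str]) -> Tuple[str, Optional[str]]:
    """Split a VAD spec such as "silero:3.1" into (backend, version).

    A version of None means "use the backend's default (latest) version".
    This is an illustrative helper, not the library's actual implementation.
    """
    # Assumption: True selects the default backend (silero, latest version).
    if spec is True:
        return ("silero", None)
    if not isinstance(spec, str):
        raise ValueError(f"Unsupported VAD spec: {spec!r}")

    # "silero:3.1" -> ("silero", ":", "3.1"); "auditok" -> ("auditok", "", "")
    backend, sep, version = spec.strip().lower().partition(":")

    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"Unknown VAD backend: {backend!r}")
    if sep and not version:
        raise ValueError(f"Missing version after ':' in {spec!r}")
    # Assumption: only silero is selected by version in the comment above.
    if version and backend != "silero":
        raise ValueError(f"Backend {backend!r} does not take a version")

    return (backend, version or None)


if __name__ == "__main__":
    print(parse_vad_spec("silero:3.1"))  # ('silero', '3.1')
    print(parse_vad_spec("auditok"))     # ('auditok', None)
```

Keeping the version as a string (rather than a float) avoids surprises with values like "3.10", and returning None for "no version given" leaves the choice of default to whatever loads the model.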