Open pique0822 opened 1 week ago
How would we use streaming based on the refactors in this PR? I think using `.transcribe` from `ReverbASR` in `asr.wenet.cli.reverb` is the way to go with `simulate_streaming`, although it's still not clear. It would help if you could add an example in a standalone `streaming.py`, or pseudocode in a README along the lines of:

```
model = how_to_initialize_model_for_streaming()
for audio_chunk in audio_stream:
    transcript_segment = how_to_call_rev_model_in_streaming_context(audio_chunk)
```
I can definitely provide some guidance on how to set up streaming. Just to respond here, though: from my view it won't be initialized or run any differently! The key thing is to follow what you have in your example: load the model once and then just call `.transcribe` on each audio chunk.
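A minimal sketch of that pattern, for concreteness. Only the overall shape (load the model once, call `.transcribe` on each chunk) comes from the reply above; `chunk_audio`, `stream_transcribe`, and the temp-file handling are illustrative assumptions, not part of the reverb API, and whether `.transcribe` accepts anything other than a file path is not confirmed here.

```python
import tempfile
import wave


def chunk_audio(pcm: bytes, frame_bytes: int, chunk_frames: int):
    """Split raw PCM bytes into consecutive chunks of chunk_frames frames."""
    step = frame_bytes * chunk_frames
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]


def stream_transcribe(model, wav_path: str, chunk_seconds: float = 2.0):
    """Transcribe wav_path chunk by chunk with a single loaded model."""
    with wave.open(wav_path, "rb") as wav:
        params = wav.getparams()
        pcm = wav.readframes(wav.getnframes())
    frame_bytes = params.sampwidth * params.nchannels
    for chunk in chunk_audio(pcm, frame_bytes, int(params.framerate * chunk_seconds)):
        # Write each chunk to a temporary WAV so .transcribe can read it as a
        # file; this assumes .transcribe takes a path, as recognize_wav.py does.
        with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
            with wave.open(tmp.name, "wb") as out:
                out.setparams(params)  # nframes is corrected on close
                out.writeframes(chunk)
            yield model.transcribe(tmp.name)
```

Note that this chunks a finished file rather than a live microphone feed; a live source would feed `chunk_audio`-sized buffers into the same loop.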
Motivation
Reverb models currently require a few steps to use. We should have a simpler way for users to load the model for transcription.
Outcomes of this PR
PIP-able Package for ASR
The `pyproject.toml` file in `asr` is updated so that running `pip install` will install the `reverb` package in your Python environment. This will make it easier to interact with reverb code from anywhere.

ReverbASR
This PR introduces the ReverbASR class, which sets up all necessary files in an object that a user can then use to transcribe recordings anywhere via `.transcribe` or `.transcribe_modes`. These functions also give users the full flexibility over the output that `recognize_wav.py` provides.

Automatic Model Downloading
Assuming you have set up your Hugging Face CLI, you can now use `mdl = load_model("reverb_asr_v1")` to download the reverb model to your home cache, `~/.cache/reverb`. This will also make loading the model easier in the future, once it has been downloaded the first time.

recognize_wav.py -> reverb
This PR updates `recognize_wav.py` to use the new ReverbASR class and includes it as a binary within the `reverb` package. Now you can call `python wenet/bin/recognize_wav.py` within the `asr` directory, or `reverb` from anywhere. All previous behavior is retained; however, a new `--model` argument is added that lets a user specify either the path to a reverb model directory containing the checkpoint and config, or the name of a pretrained reverb_asr model (for now that's only `reverb_asr_v1`).

Examples
Simple transcribe
this is equivalent to:
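The code blocks for this example appear to have been lost in formatting. A minimal sketch under the names this PR describes (`load_model`, `.transcribe`); the function wrapper and the exact `.transcribe` signature (a path to a recording) are assumptions:

```python
def simple_transcribe(load_model, audio_path):
    """Load the pretrained model once, then transcribe one recording."""
    model = load_model("reverb_asr_v1")  # cached under ~/.cache/reverb
    return model.transcribe(audio_path)

# Usage (assumed import path for the pip-installed package):
#   from reverb import load_model
#   print(simple_transcribe(load_model, "audio.wav"))
```

The CLI equivalent would be the `reverb` entry point described above, which takes the same arguments as `recognize_wav.py`.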
Transcribe Nonverbatim
this is similar to:
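The code for this example is also missing. The heading implies the model's output style can be steered toward nonverbatim; the `verbatimicity` keyword and its 0-to-1 convention used below are assumptions not confirmed by this PR description:

```python
def transcribe_nonverbatim(load_model, audio_path):
    """Transcribe with a nonverbatim (cleaned-up) output style."""
    model = load_model("reverb_asr_v1")
    # 0.0 = fully nonverbatim, 1.0 = fully verbatim (assumed convention)
    return model.transcribe(audio_path, verbatimicity=0.0)
```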
Transcribe Multiple Modes
this is similar to:
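Again the code block is missing; a sketch using the `.transcribe_modes` method this PR introduces. The mode names shown are standard wenet decoding modes, but their availability here, and the return shape, are assumptions:

```python
def transcribe_multiple_modes(load_model, audio_path, modes):
    """Run several decoding modes over the same recording in one call."""
    model = load_model("reverb_asr_v1")
    return model.transcribe_modes(audio_path, modes)

# Usage (mode names are wenet decoding modes, assumed supported here):
#   transcribe_multiple_modes(load_model, "audio.wav",
#                             ["ctc_prefix_beam_search", "attention_rescoring"])
```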