There are a few problems. First, the threads example has been broken since 8de3237f2f095d153376ca74fdf2b6d784d310ac:
error[E0308]: mismatched types
--> examples\threads.rs:54:37
|
54 | let stream = Stream::from_model(Arc::clone(&model)).expect("failed to create stream");
| ------------------ ^^^^^^^^^^^^^^^^^^
| | |
| | expected `&mut Model`, found struct `Arc`
| | help: consider mutably borrowing here: `&mut Arc::clone(&model)`
| arguments to this function are incorrect
|
= note: expected mutable reference `&mut Model`
found struct `Arc<Model>`
note: associated function defined here
--> C:\Users\jay\other-projects\coqui-stt\src\stream.rs:38:12
|
38 | pub fn from_model(model: &'a mut Model) -> crate::Result<Stream<'a>> {
| ^^^^^^^^^^
The commit fixes a UB problem, which is great, but it also creates a new one: Stream takes an exclusive reference to Model, but Model cannot outlive the stack frame without heap allocating and leaking the box.
So, i.e., this will work, but it is not ideal:
diff --git a/examples/threads.rs b/examples/threads.rs
index f592442..63edc79 100644
--- a/examples/threads.rs
+++ b/examples/threads.rs
@@ -9,7 +9,6 @@ use std::env::args;
use std::fs::File;
use std::path::Path;
use std::sync::mpsc::channel;
-use std::sync::Arc;
fn main() {
// this is copied and pasted from the basic_usage example
@@ -40,6 +39,7 @@ fn main() {
}
let mut m = Model::new(model_name.to_str().expect("invalid utf-8 found in path")).unwrap();
+ let sample_rate = m.get_sample_rate();
// enable external scorer if found in the model folder
if let Some(scorer) = scorer_name {
let scorer = scorer.to_str().expect("invalid utf-8 found in path");
@@ -48,10 +48,7 @@ fn main() {
}
// create a Stream
- // wrap the Model in an Arc: note this makes the model immutable forever, so do any changes to its options before doing this!
- let model = Arc::new(m);
- // create the Stream
- let stream = Stream::from_model(Arc::clone(&model)).expect("failed to create stream");
+ let stream = Stream::from_model(Box::leak(Box::new(m))).expect("failed to create stream");
// you can do this construction anywhere
// here, we'll do it in the main thread and send the stream to a background thread
@@ -76,7 +73,7 @@ fn main() {
let src_sample_rate = desc.sample_rate();
// keep in mind this is in an Arc, so this is immutable now
- let dest_sample_rate = model.get_sample_rate() as u32;
+ let dest_sample_rate = sample_rate as u32;
// Obtain the buffer of samples
let mut audio_buf: Vec<_> = if src_sample_rate == dest_sample_rate {
reader.samples().map(|s| s.unwrap()).collect()
Another way to do this without leaking is moving the Model into the thread first, and then creating the stream. But this does not work with the cpal API, which uses callbacks (much to my chagrin):
// TODO: We have to leak the model to extend its lifetime to 'static.
// It's the only way to satisfy the `coqui-stt` API with cpal callbacks.
let mut stream = Stream::from_model(Box::leak(model)).unwrap();
device
.build_input_stream(
&config,
move |data: &[i16], _: &InputCallbackInfo| {
stream.feed_audio(data);
if let Ok(text) = stream.intermediate_decode() {
tx.try_send(text).unwrap();
}
},
move |err| {
panic!("cpal Error: {err}");
},
)
.unwrap();
I should also mention yet another way to address this without leaking is with more channels. One Sender/Receiver pair for piping the cpal input stream to a thread that owns the Model and Stream. And a second channel pair to pipe the transcribed text to its final destination.
If possible, Stream could own the Model instead of borrowing.
Unfortunately, the transcription quality is not great with coqui, so I don't think we can use it. But since I noticed this issue while evaluating, I figured I would report my findings.
There are a few problems. First, the
threads
example has been broken since 8de3237f2f095d153376ca74fdf2b6d784d310ac:The commit fixes a UB problem, which is great, but it also creates a new one:
Stream
takes an exclusive reference toModel
, butModel
cannot outlive the stack frame without heap allocating and leaking the box.So, i.e., this will work, but it is not ideal:
Another way to do this without leaking is moving the
Model
into the thread first, and then creating the stream. But this does not work with thecpal
API, which uses callbacks (much to my chagrin):I should also mention yet another way to address this without leaking is with more channels. One Sender/Receiver pair for piping the cpal input stream to a thread that owns the
Model
andStream
. And a second channel pair to pipe the transcribed text to its final destination.If possible,
Stream
could own theModel
instead of borrowing.Unfortunately, the transcription quality is not great with coqui, so I don't think we can use it. But since I noticed this issue while evaluating, I figured I would report my findings.