Closed zxcqirara closed 9 months ago
I have now replaced it with rustpotter.process_samples(data.to_vec()),
but now it doesn't recognize the wake word, although rustpotter-cli does
Sorry for the late reply. Yes, the process_samples method is what you need to use when you already have the audio data decoded as numbers.
If you are using 16-bit integer samples you also need to set the config to:
config.fmt.sample_rate = ...;
config.fmt.sample_format = SampleFormat::I16;
config.fmt.channels = ...;
Also, the size of the chunks passed to the process_samples method should match the value returned by rustpotter.get_samples_per_frame(); chunks of any other size are ignored.
I think that is all you need to take into account.
Yeah, I'll do it. Now I have the problem that rustpotter doesn't recognize my words (but rustpotter-cli does). I tried the default settings and my own, but it made no difference
I have done as you wrote, but it still doesn't recognize anything
Here is my config:
Here is my buffer:
I have no clue what could be wrong. You can verify whether you are feeding it correctly by using the "record" feature and setting a low threshold, like 0.01, or by creating a WAV file with hound, as rustpotter does.
I also suggest you disable the audio filters until you get it to work.
listener::rustpotter::get_samples();
That is not a library method.
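One way to run the check suggested above without extra dependencies is to dump the exact samples you pass to rustpotter into a WAV file and listen to it. The sketch below is a hand-rolled stand-in for hound, not what rustpotter itself does. The function name wav_bytes_i16 is hypothetical, and it assumes 16-bit integer PCM samples.

```rust
/// Wraps raw 16-bit PCM samples in a minimal 44-byte RIFF/WAVE header.
/// Hypothetical helper for debugging; assumes integer PCM, 16 bits per sample.
fn wav_bytes_i16(samples: &[i16], sample_rate: u32, channels: u16) -> Vec<u8> {
    let bits_per_sample: u16 = 16;
    let block_align: u16 = channels * (bits_per_sample / 8);
    let byte_rate: u32 = sample_rate * u32::from(block_align);
    let data_len: u32 = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes()); // bytes following this field
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // size of the fmt chunk
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = integer PCM
    out.extend_from_slice(&channels.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&bits_per_sample.to_le_bytes());
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    // WAV stores samples little-endian.
    for s in samples {
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}
```

Writing the result with std::fs::write("capture.wav", wav_bytes_i16(&samples, 16000, 1)) and playing it back should show whether the sample rate, channel count or sample format is off; a wrong setting usually sounds sped up, slowed down or like noise.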
It is my function
And I can't see any files, or even the folder, that I specified in record_path
The record folder needs to exist and be writable.
But if you don't get any detections after lowering the threshold, it will not record anything; it only records on partial detections.
I got many short WAVs with a strange flicking sound
My detections in rustpotter-cli: Works correctly (idk why I decided to blur it)
Then you must be doing something wrong with the audio data, or the format is configured incorrectly. If you use the record option with the CLI, you will see that the records are audible.
I took the source from another project and just upgraded rustpotter from 2.0.1 to 3.0.1; with the previous version everything was working correctly
If you want to send me the code diffs, maybe there is something I haven't got right. So far I haven't found any problems migrating the things I was using to v3, but maybe I'm missing something.
Ok, Cargo.toml:
rustpotter = "2.0.0"
to
rustpotter = { git = "https://github.com/GiviMAD/rustpotter", features = ["record"] }
Rustpotter config:
pub const RUSTPOTTER_DEFAULT_CONFIG: Lazy<RustpotterConfig> = Lazy::new(|| {
    RustpotterConfig {
        fmt: WavFmt::default(),
        detector: DetectorConfig {
            avg_threshold: 0.,
            threshold: 0.5,
            min_scores: 15,
            score_mode: ScoreMode::Average,
            comparator_band_size: 5,
            comparator_ref: 0.22
        },
        filters: FiltersConfig {
            gain_normalizer: GainNormalizationConfig {
                enabled: true,
                gain_ref: None,
                min_gain: 0.7,
                max_gain: 1.0,
            },
            band_pass: BandPassConfig {
                enabled: true,
                low_cutoff: 80.,
                high_cutoff: 400.,
            }
        }
    }
});
to
pub const RUSTPOTTER_DEFAULT_CONFIG: Lazy<RustpotterConfig> = Lazy::new(|| {
    RustpotterConfig {
        fmt: AudioFmt::default(),
        detector: DetectorConfig {
            avg_threshold: 0.,
            threshold: 0.5,
            min_scores: 15,
            score_mode: ScoreMode::Average,
            eager: false,
            band_size: 5,
            score_ref: 0.22,
            vad_mode: None,
            record_path: Some(String::from("./recs")),
        },
        filters: FiltersConfig {
            gain_normalizer: GainNormalizationConfig {
                enabled: true,
                gain_ref: None,
                min_gain: 0.7,
                max_gain: 1.0,
            },
            band_pass: BandPassConfig {
                enabled: true,
                low_cutoff: 80.,
                high_cutoff: 400.,
            }
        }
    }
});
Rustpotter init:
pub fn init() -> Result<(), ()> {
    let rustpotter_config = config::RUSTPOTTER_DEFAULT_CONFIG;
    // create rustpotter instance
    match Rustpotter::new(&rustpotter_config) {
        Ok(mut rinstance) => {
            // success
            // wake word files list
            // @TODO. Make it configurable via GUI for custom user voice.
            let rustpotter_wake_word_files: [&str; 5] = [
                "rustpotter/jarvis-default.rpw",
                "rustpotter/jarvis-community-1.rpw",
                "rustpotter/jarvis-community-2.rpw",
                "rustpotter/jarvis-community-3.rpw",
                "rustpotter/jarvis-community-4.rpw",
                // "rustpotter/jarvis-community-5.rpw",
            ];
            // load wake word files
            for rpw in rustpotter_wake_word_files {
                rinstance.add_wakeword_from_file(rpw).unwrap();
            }
            // store
            RUSTPOTTER.set(Mutex::new(rinstance));
        },
        Err(msg) => {
            error!("Rustpotter failed to initialize.\nError details: {}", msg);
            return Err(());
        }
    }
    Ok(())
}
to
pub fn init() -> Result<(), ()> {
    let rustpotter_config = config::RUSTPOTTER_DEFAULT_CONFIG;
    // create rustpotter instance
    match Rustpotter::new(&rustpotter_config) {
        Ok(mut rinstance) => {
            // success
            // wake word files list
            rinstance.add_wakeword_from_file("first", "rustpotter/first.rpw").unwrap();
            rinstance.add_wakeword_from_file("second", "rustpotter/second.rpw").unwrap();
            rinstance.add_wakeword_from_file("third", "rustpotter/third.rpw").unwrap();
            // store
            RUSTPOTTER.set(Mutex::new(rinstance));
        },
        Err(msg) => {
            error!("Rustpotter failed to initialize.\nError details: {}", msg);
            return Err(());
        }
    }
    Ok(())
}
All the code was taken from this repo
Well, I see several problems there; the audio format was not correctly configured.
At the very least it should be (this assumes you are using 16000 Hz, single-channel audio):
fmt: AudioFmt {
    sample_format: rustpotter::SampleFormat::I16,
    // struct update syntax must come last, without a trailing comma
    ..Default::default()
},
Also, the frame size should be correctly initialized (but I assume you already fixed that, as you got it to record), or you can use a buffering solution like the one implemented in rustpotter-cli (I added it because I ran into problems setting the cpal buffer size to the required value on some platforms; that may already be fixed).
I meant this:
...
buffer.extend_from_slice(data);
while buffer.len() >= rustpotter_samples_per_frame {
    let detection = rustpotter.process_samples(
        buffer
            .drain(0..rustpotter_samples_per_frame)
            .as_slice()
            .into(),
    );
    print_detection(
        &*rustpotter,
        detection,
        partial_detection_counter,
        debug,
        debug_gain,
        get_time_string,
    );
}
...
It's probably not the most efficient solution: I push the audio data onto the end of a vector and then drain it until what is left is smaller than the required chunk size. The buffer should be declared in a parent scope so that it is reused between function calls, as you do with the rustpotter instance. I think it can fit there, so you don't need to change the general frame size.
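For anyone who wants to try the buffering idea on its own, here is a minimal standalone sketch with no rustpotter dependency. FrameBuffer and its fields are hypothetical names, not part of rustpotter or rustpotter-cli; in real use frame_len would be the value returned by rustpotter.get_samples_per_frame() and each returned frame would be handed to process_samples.

```rust
/// Accumulates audio chunks of arbitrary size and hands out fixed-size frames.
/// Hypothetical helper; `frame_len` stands in for the detector's required frame size.
struct FrameBuffer {
    frame_len: usize,
    pending: Vec<i16>,
}

impl FrameBuffer {
    fn new(frame_len: usize) -> Self {
        assert!(frame_len > 0, "frame length must be non-zero");
        Self { frame_len, pending: Vec::new() }
    }

    /// Appends `data` and returns every complete frame that is now available.
    /// Leftover samples stay buffered until the next call.
    fn push(&mut self, data: &[i16]) -> Vec<Vec<i16>> {
        self.pending.extend_from_slice(data);
        let mut frames = Vec::new();
        while self.pending.len() >= self.frame_len {
            // Drain exactly one frame from the front of the buffer.
            frames.push(self.pending.drain(..self.frame_len).collect());
        }
        frames
    }
}
```

The buffer has to outlive a single audio callback, so it would live next to the rustpotter instance rather than being created on each call.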
Omg, I recreated the project and just made the edits I wrote above, and it worked... I think I broke something while editing the code last time
Great to know you managed to make it work.
I encourage you to try creating a trained wakeword model to replace the v2 files you are currently using; it should work better than what are now called "wakeword references". I have one created with around 2000 samples (200 records of the wakeword + 1800 noise and silence records), and in my experience it works far better, above all in the presence of small noises.
Edit: one tip, the record functionality is a great help for augmenting the dataset, as the produced records match the duration of the longest wakeword.
Best regards!
I have audio data in &[i16], but process_bytes accepts only &[u8] (this appeared after upgrading from 2.0.1 to 3.0.1)
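The thread settles on process_samples for this case. If the byte-oriented method is still wanted, the samples can be serialized first; a minimal sketch, assuming the configured format is 16-bit integer with little-endian byte order. The helper name i16_to_le_bytes is hypothetical.

```rust
/// Serializes 16-bit samples into little-endian bytes, two bytes per sample.
/// Assumption: the consumer is configured for little-endian 16-bit integers.
fn i16_to_le_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}
```

The resulting byte chunk would then have to match the frame size in bytes instead of in samples.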