Open twitchyliquid64 opened 8 months ago
No repro on 1.76. Have you tried upgrading your compiler?
Couldn't reproduce the SIGSEGV on any of:
- `b50a77c03d640716296021ad58950b1bb0345799`
The build is indeed slow on all of these.
Do you mind sharing the nix recipe for how you got that compiler?
Also, do you have anything in your `~/.cargo/config.toml`?
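For context, this is the sort of thing that would matter: a hypothetical `~/.cargo/config.toml` with overrides that can change codegen or linking (these exact values are made up for illustration):

```toml
# Hypothetical overrides that could change codegen/link behavior:
[build]
rustflags = ["-C", "target-cpu=native"]

[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
```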
Nothing in config.toml, and it’s just the compiler on the unstable channel.
@GrigorenkoPV Just to check, does the 1.75.0 compiler you used also have 82e1608dfa6e0b5569232559e3d385fea5a93112 in the `rustc --version --verbose` hash?
@twitchyliquid64 Thank you for reporting. It seems the segfault's cause is ephemeral, unfortunately. You also reported a performance issue; we can at least take a gander at that. Can you get the data from `RUSTFLAGS="-Zself-profile" cargo +nightly rustc --release` (or an equivalent command with the RUSTFLAGS enabled) for this build?
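For reference, a sketch of how that profile is typically collected and read, assuming the `summarize` tool from rust-lang/measureme (command names per that repo; adjust filenames as needed):

```shell
# Collect per-query timing data from rustc (nightly-only flag):
RUSTFLAGS="-Zself-profile" cargo +nightly rustc --release
# This drops <crate-name>-<pid>.mm_profdata files in the working directory.

# Install and run the summarize tool from the measureme repo:
cargo install --git https://github.com/rust-lang/measureme summarize
summarize summarize <crate-name>-<pid>.mm_profdata
```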
> @GrigorenkoPV Just to check, does the 1.75.0 compiler you used also have 82e1608 in the `rustc --version --verbose` hash?
Yes. The nix store path also seems to match.
I got a SIGSEGV and the compiler told me to file this issue.
Running again, I no longer get the segv, but it also took roughly 4.5 minutes to build the release binary.
Meta

`rustc --version --verbose`:

Repro instructions
1. Checkout sonos/tract at `f48c24fe8a7b6b2b6dc9adfc838222f728272856`
2. Paste the following code in as a replacement for `examples/onnx-mobilenet-v2/main.rs` and add a dep for `circular-buffer = "0.1"`

```rust
use circular_buffer::CircularBuffer;
use tract_onnx::prelude::*;

fn samples_from_stdin(
stdin: &mut std::io::StdinLock,
) -> tract_ndarray::ArrayBase<tract_ndarray::OwnedRepr<f32>, tract_ndarray::Dim<[usize; 2]>> {
tract_ndarray::Array2::from_shape_fn((1, 1280), |(_, c)| {
use std::io::Read;
let mut buffer = [0u8; std::mem::size_of::<i16>()];
stdin.read_exact(&mut buffer).unwrap();
let sample = i16::from_le_bytes(buffer);
sample as f32
})
}
/// Number of spectograms we track, and the minimum input to the embedding model
const NUM_SPECTOGRAMS: usize = 76;
/// Number of embeddings we track, and the minimum input to the wakeword model
const NUM_EMBEDDINGS: usize = 16;
#[derive(Default, Clone, Debug)]
struct Melspectogram([f32; 32]);
impl Melspectogram {
pub fn iter_mut(&mut self) -> core::slice::IterMut<'_, f32> {
self.0.iter_mut()
}
pub fn iter(&self) -> core::slice::Iter<'_, f32> {
self.0.iter()
}
}
#[derive(Clone, Debug)]
struct Embedding([f32; 96]);
// derive(Default) doesn't work on arrays > 32, grrrr
impl Default for Embedding {
fn default() -> Self {
Self([0f32; 96])
}
}
impl Embedding {
pub fn iter_mut(&mut self) -> core::slice::IterMut<'_, f32> {
self.0.iter_mut()
}
pub fn iter(&self) -> core::slice::Iter<'_, f32> {
self.0.iter()
}
}
/// arecord -r 16000 -f S16_LE | cargo run
fn main() -> TractResult<()> {
let spec_model = tract_onnx::onnx()
// load the model
.model_for_path("melspectrogram.onnx")?
.into_optimized()?
.into_runnable()?;
let embedding_model = tract_onnx::onnx()
// load the model
.model_for_path("embedding_model.onnx")?
.with_input_fact(0, f32::fact([1, 76, 32, 1]).into()).unwrap()
.into_optimized()?
.into_runnable()?;
let final_model = tract_onnx::onnx()
// load the model
.model_for_path("hey_rhasspy_v0.1.onnx")?
.into_optimized()?
.into_runnable()?;
let mut spectograms = CircularBuffer::<NUM_SPECTOGRAMS, Melspectogram>::new();
let mut embeddings = CircularBuffer::<NUM_EMBEDDINGS, Embedding>::new();
let mut stdin = std::io::stdin().lock();
for _ in 0..(4 * 16000 / 1280) {
let samples: Tensor = samples_from_stdin(&mut stdin).into();
// run the spectogram on the input
let out = spec_model.run(tvec!(samples.into()))?.remove(0);
// so the spectogram output is [1, 1, 5, 32] but we only care about each 32-float sequence,
// each of which represents a spectogram. Lets iterate in those chunks and add it to our buffer.
for chunk in out.as_slice::<f32>().unwrap().chunks(32) {
let mut out = Melspectogram::default();
chunk.into_iter().zip(out.iter_mut()).for_each(|(input, output)| {
// Don't h8 this is what openWakeWords does! https://github.com/dscripka/openWakeWord/blob/main/openwakeword/utils.py#L180
// ¯\_(ツ)_/¯ ¯\_(ツ)_/¯ ¯\_(ツ)_/¯ ¯\_(ツ)_/¯
*output = *input / 10.0 + 2.0;
});
spectograms.push_back(out);
}
// Don't compute the embeddings unless we have a full set of input (76 spectograms)
// for the model
if !spectograms.is_full() {
continue;
}
// Build a tensor that will be the input to the embedding model, which is [?, 76, 32, 1].
// I presume that means [batch_size=1, num_melspectograms=76, num_spect_bins=32, ?].
let embedding_input: Tensor = tract_ndarray::Array1::<f32>::from_iter(
spectograms.iter().map(|spect| spect.iter()).flatten().copied(),
).into_shape((1, 76, 32, 1))?.into();
// println!("model: {:?}", embedding_model.model());
// Compute the embedding for this chunk of spectograms.
let out = embedding_model.run(tvec!(embedding_input.into()))?.remove(0);
// so the embedding output is [1, 1, 1, 96], lets collect that into an Embedding struct
// and push it into our embedding buffer.
let mut embedding = Embedding::default();
embedding.0.clone_from_slice(out.as_slice::<f32>().unwrap());
embeddings.push_back(embedding);
// Don't compute the features unless we have a full set of input (16 embeddings)
if !embeddings.is_full() {
continue;
}
// Build a tensor that will be the input to the feature model, which is [1, 16, 96].
let feature_input: Tensor = tract_ndarray::Array1::<f32>::from_iter(
embeddings.iter().map(|emb| emb.iter()).flatten().copied(),
).into_shape((1, 16, 96))?.into();
let out = final_model.run(tvec!(feature_input.into()))?.remove(0);
println!("{:?}", out);
}
Ok(())
}
```
3. `cd` to `examples/onnx-mobilenet-v2`, run `cargo run --release`, and observe it segfault once