grame-cncm / faust

Functional programming language for signal processing and sound synthesis
http://faust.grame.fr
Other
2.54k stars 319 forks source link

Rust: Stack overflow in debug with temporal effects #804

Closed plule closed 1 year ago

plule commented 2 years ago

Follow-up to #803 , paging @bluenote10 and @smoothdeveloper who participated on it.

Problem

As soon as a temporal effect is used (delay), a buffer big enough to store it is allocated in an array. Simplified generated code:

struct Dsp {
    pub a: [f32; 1024 * 1024],
}

impl Dsp {
    pub fn new() -> Self {
        Self {
            a: [0.0; 1024 * 1024],
        }
    }
}

This array does not fit in the stack, so allocating it fails with a stack overflow. In the architecture files, this DSP is allocated with Box::new(Dsp::new()). In debug compilation, the value is first allocated on the stack, then moved on the heap, leading to the same stack overflow. Fortunately in Release mode the allocation is optimized to go directly on the stack, so this problem is only appearing in debug. This problem is known in rust: https://github.com/rust-lang/rust/issues/53827

Solution attempt 1

A possible fix is to allocate the arrays directly in a Vec, which is on the heap. Roughly:

struct Dsp {
    pub a: Vec<f32>,
}

impl Dsp {
    pub fn new() -> Self {
        Self {
            a: vec![0.0; 1024 * 1024],
        }
    }
}

This solution can be seen in #803 and solves the problem. Unfortunately, it also comes with its own problems, namely forcing heap allocation even when unnecessary, and fragmenting the memory allocation.

This approach could be revived by only allocating huge arrays in vectors, and keep smaller ones in arrays, but I feel like we would just be "chasing" the solution.

Solution attempt 2

Second attempt: there is an unstable syntax dedicated to allocating on the heap: box_syntax. Unfortunately, it's been unstable for years, and at this point does not solve the stack overflow:

// Ok
let a = box [0.0; 1024 * 1024];
// Stack overflow
let dsp = box Dsp::new();

Solution attempt 3

Third party crates such as default-boxed solve this problem by allocating the memory on the heap, then assigning the initial values there. The implementation includes unsafe code (as any manual allocation), and solves the issue, while retaining the memory layout, without baking the heap allocation in the Dsp:

#[cfg_attr(feature = "default-boxed", derive(default_boxed::DefaultBoxed))]
struct Dsp {
    pub a: [f32; 1024 * 1024],
}

fn main() {
    let dsp = Dsp::default_boxed();
}

I think that this option is nice enough, and easy to implement. The dependency to default-boxed can be made optional by putting this behind an opt-in feature flag (which is what the cfg_attr does).

While this last solution ticks all the boxes, I'm unsure if it's worth the added complexity and possible confusion from the "optional dependency" mechanism. I'd be up to implement it in Faust, Frando/rust-faust and the architecture files if this solution is suitable.

Let me know what you think, maybe I've missed another possible approach.

bluenote10 commented 2 years ago

Just some quick feedback without having thought about it in detail:

Indeed, on first glance default-boxed looks very promising. It could serve as a work-around until Rust has an appropriate native solution. I like that it basically only needs an attribute attached to the dsp struct, so integrating it seems relatively straightforward.

The only thing I cannot judge is how hard it would be pass the flag through faust, i.e., whether it is e.g. okay to introduce a Rust-specific command line option, or if there is a preferred approach from a faust perspective in general.

I'm wondering if this setting can/should even be generalized to "pass any attribute to the generated dsp struct"? This would be more general (enabled perhaps other user specific attributes), but perhaps it is also awkward having to pass-in a valid Rust annotation from the command line... Or could it even be injected from the architecture file?

smoothdeveloper commented 2 years ago

@plule, note that I'm pretty much newbie with rust, nor familiar with faust, but thanks for prompting.

To me the solution 3 seems great:

Is there any advantage for the constructor method to have two code paths depending if using Release or Debug configuration?

From your comment in #803, it seems the Release code works, albeit, it might hit the same problem for allocations that are larger (https://github.com/rust-lang/rust/issues/89908 it is mentioned that stack allocation larger than 2mb are bound to fail).

plule commented 2 years ago

@bluenote10

The only thing I cannot judge is how hard it would be pass the flag through faust, i.e., whether it is e.g. okay to introduce a Rust-specific command line option, or if there is a preferred approach from a faust perspective in general.

Faust itself should always output the same thing that includes the opt-in feature flag in the code. I've pushed my playground here to show what I mean. In this project: default-boxed is an optional dependency, and you can run the project with either cargo run or cargo run --features default-boxed.

Maybe there could be a flag in faust2*rust since these have the option to call cargo build, but that would be really just at the end of the chain.

@smoothdeveloper

Is there any advantage for the constructor method to have two code paths depending if using Release or Debug configuration?

I think that having the two code paths exposed to the end user is good, though it should be controlled explicitly and not depend on the Release/Debug configuration. Which mean that a user can build with cargo build --features default-boxed, cargo build, or bring their own architecture file and not rely on the heap at all with Dsp::new().

From your comment in https://github.com/grame-cncm/faust/pull/803, it seems the Release code works, albeit, it might hit the same problem for allocations that are larger (https://github.com/rust-lang/rust/issues/89908 it is mentioned that stack allocation larger than 2mb are bound to fail).

I can't manage to make the original code fail at all in Release, even with bigger array. Though maybe my test get optimized away at some point

bluenote10 commented 2 years ago

Faust itself should always output the same thing that includes the opt-in feature flag in the code. I've pushed my playground here to show what I mean. In this project: default-boxed is an optional dependency, and you can run the project with either cargo run or cargo run --features default-boxed.

Ah now I see, thanks for the demo. I initially thought you mean generating/omitting the #[derive(default_boxed::DefaultBoxed)] entirely depending on a faust command line argument or so. But you're right, enabling/disabling it via a Cargo feature avoids any switch at generation time, and transfers it towards Rust compilation. Users could still enable it by default by setting a corresponding default feature. I guess the only thing that isn't possible is to enable or disable it unconditionally, i.e., not exposing the feature publicly (there is no such thing as private crate features, right?). But that is not super important, so I'd say that looks pretty good.