raphlinus / font-rs

Apache License 2.0

Runtime detection of SSE capabilities #29

Open raphlinus opened 5 years ago

raphlinus commented 5 years ago

Stable Rust now has the is_x86_feature_detected! macro, which should be used to switch between the SSE and fallback implementations based on runtime detection of SSE support.

6D65 commented 5 years ago

Did you mean something like this?

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    if is_x86_feature_detected!("sse") {
        unsafe { accumulate_sse(src) }
    } else {
        let mut acc = 0.0;
        src.iter()
            .map(|c| {
                // This would translate really well to SIMD
                acc += c;
                let y = acc.abs();
                let y = if y < 1.0 { y } else { 1.0 };
                (255.0 * y) as u8
            }).collect()
    }
}

raphlinus commented 5 years ago

Yes, very much like that. The comment can probably be adapted though :)
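For readers following along, here is a minimal sketch of how the dispatch above could be made to compile on non-x86 targets as well, since is_x86_feature_detected! is only available on x86 and x86_64. The helper name accumulate_scalar and the body of accumulate_sse are assumptions made for illustration: the stand-in simply reuses the scalar loop and is not the crate's actual SSE kernel.

```rust
/// Portable fallback: running sum, absolute value, clamp to 1.0, scale to 0..=255.
/// (Hypothetical helper name; the logic mirrors the fallback branch proposed above.)
fn accumulate_scalar(src: &[f32]) -> Vec<u8> {
    let mut acc = 0.0f32;
    src.iter()
        .map(|&c| {
            acc += c;
            let y = acc.abs();
            let y = if y < 1.0 { y } else { 1.0 };
            (255.0 * y) as u8
        })
        .collect()
}

/// Stand-in for the SSE path. Assumption: a real version would use SSE
/// intrinsics; this placeholder only delegates to the scalar loop.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse")]
unsafe fn accumulate_sse(src: &[f32]) -> Vec<u8> {
    accumulate_scalar(src)
}

pub fn accumulate(src: &[f32]) -> Vec<u8> {
    // The runtime check only exists on x86/x86_64, so it is gated with cfg.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse") {
            // SAFETY: the CPU was just confirmed to support SSE.
            return unsafe { accumulate_sse(src) };
        }
    }
    // Every other target, or an x86 CPU without SSE, takes the fallback.
    accumulate_scalar(src)
}
```

Calling accumulate(&[0.5, 0.25, -1.0, 2.0]) should give the same bytes whichever branch is taken, which makes the two paths easy to compare in a test.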

codri commented 5 years ago

The comments can be left as bookmarks to track the code copying patterns :D I can make a pull request tomorrow.

Also, the overhead of one branch instruction (for feature detection), as well as the zeroing of the result vector (added by me in accumulate_sse), keeps bugging me.

Maybe something can be done about the vector zeroing. I'm not sure that building for one specific CPU feature is a good alternative to runtime feature detection.

Both are most likely insignificant and not worth the time, unless profiling says otherwise.
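On the vector zeroing concern, one possible approach (a sketch only, not what accumulate_sse does) is to reserve capacity, write every output byte through Vec::spare_capacity_mut, and set the length afterwards. The function name accumulate_no_zeroing is hypothetical, and the scalar loop is used here purely to keep the example short.

```rust
use std::mem::MaybeUninit;

/// Hypothetical variant that fills the output without zeroing it first.
fn accumulate_no_zeroing(src: &[f32]) -> Vec<u8> {
    // Reserve room for the result; the bytes start out uninitialized.
    let mut out: Vec<u8> = Vec::with_capacity(src.len());
    let mut acc = 0.0f32;
    {
        // The spare capacity may be longer than src; zip stops at src.len().
        let spare: &mut [MaybeUninit<u8>] = out.spare_capacity_mut();
        for (slot, &c) in spare.iter_mut().zip(src) {
            acc += c;
            let y = acc.abs();
            let y = if y < 1.0 { y } else { 1.0 };
            slot.write((255.0 * y) as u8);
        }
    }
    // SAFETY: capacity is at least src.len(), and the first src.len()
    // slots were all written in the loop above.
    unsafe { out.set_len(src.len()) };
    out
}
```

Whether this is measurably faster than zeroing is exactly the kind of question that profiling would have to answer.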