BurntSushi / rust-snappy

Snappy compression implemented in Rust (including the Snappy frame format).
BSD 3-Clause "New" or "Revised" License
444 stars 43 forks source link

Use std::arch for crc checking if available #19

Closed KodrAus closed 4 years ago

KodrAus commented 6 years ago

There's a comment in the source already but I thought I'd open an issue for this too, now that x86 intrinsics will be available to us in stable Rust 1.27. I've got a crc function I've been using that looks something like:

fn crc32c(buf: &[u8]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            return unsafe {
                crc32c_sse42(buf)
            }
        }
    }

    crc32c_slice8(buf)
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn crc32c_sse42(mut buf: &[u8]) -> u32 {
    use std::arch::x86_64::*;

    let mut crc: u32 = !0;

    while buf.len() >= 4 {
        let b = LE::read_u32(&buf[0..4]);
        crc = _mm_crc32_u32(crc, b);

        buf = &buf[4..];
    }

    for &b in buf {
        crc = _mm_crc32_u8(crc, b);
    }

    !crc
}

With SSE4.2 support that's approximately, roughly somewhere in the ballpark of much faster than our fallback implementation.

I'm not so sure yet how we'd want to approach the target_feature stuff for a library, I'm guessing the static check for SSE4.2 support would be better than a runtime check? I also excluded 32bit because I didn't need it.

BurntSushi commented 6 years ago

Thanks for the reminder! I'd prefer to hold off for at least a few releases after 1.27. In general, I think it's bad form for the ecosystem to immediately adopt new stable features, especially in libraries, because it requires that everyone else bump their minimum Rust version.

We could add this behind a feature or otherwise do version sniffing in a build.rs to enable use of new features. That's what I plan on doing for regex. I tend to think it's more trouble than it's worth for this crate though...

I'm not so sure yet how we'd want to approach the target_feature stuff for a library, I'm guessing the static check for SSE4.2 support would be better than a runtime check?

IMO, we should always be aiming for runtime detection unless there is a specific reason why that is unsuitable. Runtime detection has the wonderful benefit of being completely transparent. Compile time checks require compilers of the software to build non-portable binaries with special flags. Runtime detection "just works."

I don't think there should be any problem with runtime detection here. (The chief problem with runtime detection is amortizing the CPU check. Typically, the SIMD work will dwarf it anyway, but there could be cases where it is a bit unnatural.)

JuanPotato commented 6 years ago

Related to this, are there plans to move the crc code to its own library for others to use since the current one is relatively slower than the one in snappy.

BurntSushi commented 6 years ago

There are no specific plans on my part. For the most part, I didn't want to add yet another crc crate, but I also don't have the bandwidth to collaborate with others on improving an existing crate. Others are certainly welcome to do it though!