ogxd / gxhash

The fastest hashing algorithm 📈
https://docs.rs/gxhash
MIT License
696 stars 23 forks

need specific instructions when use as a library #86

Closed jianshu93 closed 1 month ago

jianshu93 commented 1 month ago

Dear gxhash team,

I was using it for some MinHash algorithms (https://github.com/jianshu93/hyperminhash-rs/blob/354358e6703774143d6a5a9fc19c61329aa6c457/src/lib.rs#L165) to estimate set similarity:

I have the following compiling error:

error: Gxhash requires aes and sse2 intrinsics. Make sure the processor supports it and build with RUSTFLAGS="-C target-cpu=native" or RUSTFLAGS="-C target-feature=+aes,+sse2".
 --> /storage/home/hcoda1/4/jzhao399/.cargo/registry/src/index.crates.io-6f17d22bba15001f/gxhash-3.4.1/src/gxhash/platform/x86.rs:2:1
  |
2 | compile_error!{"Gxhash requires aes and sse2 intrinsics. Make sure the processor supports it and build with RUSTFLAGS=\"-C target-cpu=native\" or RUSTFLAGS=\"-C target-feature=+aes,+sse2\"."}
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I have to use RUSTFLAGS="-C target-feature=+aes,+sse2" when compiling, and then it works. However, I do have both SSE2 and AES on my platform, and I do not have this problem when building the library itself. Any idea how I can avoid this when using gxhash as a library, since most modern CPUs should have those 2 features? FYI, there is no problem when using it as a library on ARM (Mac M series chips).

Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear spec_ctrl intel_stibp flush_l1d arch_capabilities

Thanks,

Jianshu

ogxd commented 1 month ago

Hello @jianshu93 !

I am not sure I understand. Do you mean that the documentation is unclear about the CPU feature requirements? Or are you having trouble building your project when using gxhash? The CPU flags you shared contain both aes and sse2, so it should work fine.

jianshu93 commented 1 month ago

hi @ogxd ,

I am not sure when to use RUSTFLAGS="-C target-cpu=native" when compiling gxhash as a dependency. Without it, the build works on macOS but not on Linux (RHEL). I have no idea why: the code is exactly the same, and only the system/instruction set is different. It seems rustc on Linux did not detect whether AES or SSE2 is supported.

Thanks,

Jianshu

ogxd commented 1 month ago

Oh yes, it seems rustc does not correctly detect CPU features for some CPU models. Given some past issues I found, this seems not so uncommon, and it is likely related to LLVM. Some examples: 1 2. So I am guessing your Linux machine uses a very recent CPU whose feature set is not yet known by LLVM (or at least by the version you're using; maybe you can try updating rustc).
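To make the distinction concrete, here is a minimal sketch contrasting what the compiler was allowed to use at build time with what the CPU reports at run time. The gxhash error above comes from a compile_error!, so it depends on the former. The function names (compile_time_aes, runtime_aes, describe) are made up for illustration and are not part of gxhash. As far as I know, SSE2 is part of the x86_64 baseline while AES is not, and the default aarch64-apple-darwin target enables AES, which may be why the Mac build passes without any flags; treat that as an assumption rather than a confirmed diagnosis.

```rust
/// True if this build was compiled with AES enabled,
/// i.e. via -C target-cpu=... or -C target-feature=+aes (or the target's defaults).
pub fn compile_time_aes() -> bool {
    cfg!(target_feature = "aes")
}

/// True if the CPU running this program reports AES support (x86 / x86_64).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn runtime_aes() -> bool {
    std::arch::is_x86_feature_detected!("aes")
}

/// Same check on 64-bit ARM.
#[cfg(target_arch = "aarch64")]
pub fn runtime_aes() -> bool {
    std::arch::is_aarch64_feature_detected!("aes")
}

/// Other architectures are not covered by this sketch.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
pub fn runtime_aes() -> bool {
    false
}

/// One-line summary of both views.
pub fn describe() -> String {
    format!(
        "aes: compile-time={}, runtime={}",
        compile_time_aes(),
        runtime_aes()
    )
}
```

On a machine whose CPU has AES, a plain cargo build for x86_64-unknown-linux-gnu would be expected to report compile-time=false and runtime=true, which matches the situation described in this issue.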

As you found out, instead of -C target-cpu=native you can directly use -C target-feature=+aes,+sse2 for x86 or -C target-feature=+aes,+neon for ARM, although it's a little less convenient.

You can try adding this same file to your project to automatically build with -C target-cpu=native. It must be placed in .cargo/config.toml.

Maybe you can even add to it a specific RUSTFLAGS depending on the platform (I haven't tried it)

[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "target-feature=+aes,+sse2"]
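If you want a downstream crate to flag the problem early, one option is a build script that inspects CARGO_CFG_TARGET_FEATURE, the comma-separated list of enabled target features that Cargo passes to build scripts. The sketch below is untested against gxhash itself, and the helper names missing_features and missing_from_env are hypothetical.

```rust
use std::env;

/// Hypothetical helper: returns the required features that are absent
/// from a comma-separated list of enabled target features.
pub fn missing_features(enabled: &str, required: &[&str]) -> Vec<String> {
    let enabled: Vec<&str> = enabled.split(',').map(str::trim).collect();
    required
        .iter()
        .filter(|feature| !enabled.contains(*feature))
        .map(|feature| feature.to_string())
        .collect()
}

/// Hypothetical helper for use inside build.rs: reads the list Cargo provides.
/// Outside of a build script the variable is unset, so every required
/// feature is reported as missing.
pub fn missing_from_env(required: &[&str]) -> Vec<String> {
    let enabled = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();
    missing_features(&enabled, required)
}
```

In a build.rs, main could call missing_from_env(&["aes", "sse2"]) and print a cargo:warning= line for each missing feature, pointing the user at the RUSTFLAGS or .cargo/config.toml options above.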

jianshu93 commented 1 month ago

Thanks! This is helpful. It is just so weird that this is CPU dependent (I was thinking it should be the same no matter what the system is).

Jianshu

ogxd commented 1 month ago

Yes it is quite annoying to have to rely on this. There is an initiative for portable SIMD in Rust, which would allow this crate to be used in any condition without having to specify any RUSTFLAGS, but we're not there yet. I have made an implementation in C# which has portable SIMD wrappers and it can be used in a portable manner, unlike this Rust crate.

jianshu93 commented 1 month ago

I think ahash also uses portable SIMD for AES but not for SSE (it had a different name before, and has had this new name since Rust nightly 1.78). That might be helpful?

Thanks,

Jianshu