Closed jasondavies closed 6 years ago
I can mentor you. If it's your first contribution it is easier if you report "regularly" or just open a PR so that we can track your progress and help if you get stuck. This is, in a nutshell, the list of all steps necessary. It looks like a lot of steps, but these steps are very very small. In your case, the step number is a bit larger because you want to add intrinsics for a new feature
that we haven't white-listed yet.
cpuid
instruction. This page tells you everything: https://en.wikipedia.org/wiki/CPUIDllc --mcpu=help
gives you a list of all supported cpus by your LLVM version. Choose one cpu that has the feature and llc --mcpu=<cpu_here> --mattr=help
will tell you all the backend features that the CPU supports. Find the name of the one you care about.stdsimd
: to use intrinsics that require the feature, the CPU must support it, so in this step you need to send a PR to stdsimd
that includes the code that detects whether the feature is enabled or not, and if its enabled, activates the feature. This is a two-liner change.
detect
module, for x86
that would be here: https://github.com/rust-lang-nursery/stdsimd/blob/master/stdsimd/arch/detect/arch/x86.rs#L28stdsimd
with this step.coresimd/x86
module. If the feature is its own self-standing feature, you probably want to add a new module there called <featurename>.rs
(check how rdtsc does it: https://github.com/rust-lang-nursery/stdsimd/blob/master/coresimd/x86/rdtsc.rs)extern
and use link_llvm_intrinsic
to give it the LLVM function name. Like here: https://github.com/rust-lang-nursery/stdsimd/blob/master/coresimd/x86/rdtsc.rs#L48 . Finding the names of the llvm functions that you want to call can be tricky. A good approach is to search for the C intrinsic in https://github.com/llvm-mirror/clang/tree/master/lib/Headers to see how clang implements them. If it uses a builtin, you can search for the code-gen tests that make sure that it generates the right call to the corresponding LLVM function in clang and/or LLVM source files. For example, I just searched in the clang repo for rdtsc and landed here: https://github.com/llvm-mirror/clang/blob/a38ba9770a70056f9fdd6c71f819e5db45a105e4/test/CodeGen/rd-builtins.c#L15 The llvm.x86....
is the name of the intrinsic one is looking for.#[inline]
#[target_feature(enable = "featurename")]
assert_instr(<instruction_name>)
procedural macro does this for you. The #[cfg_attr(test, ...)] just makes sure that these tests are only generated when compiling the tests of the library. #[simd_test = "<feature_name>"]
that makes sures that the test only runs if the CPU where the library is being tested actually supports the feature. stdsimd
for this. Test might fail for a variety of reasons, and we have seen many of them. So if you get stuck here just go directly to step 5 and we'll take a look at it. Also, if you want to add newer intrinsics for a particular feature, you don't need Step 1 and 2, and can go directly to Steps 3,4,5.
Oh if you'd like I got started on these awhile ago but never got around to adding tests and landing them, so feel free to use that as a starting point!
Wonderful. Thank you!
sha
feature is already white-listed in rustc? Should save a bit of time if so!Step 1: unless I'm missing something, it seems the sha feature is already white-listed in rustc? Should save a bit of time if so!
Yes, sha
is already white-listed. If that's everything you need then you can skip step 1 and go straight to step 2 and take it where @alexcrichton left it.
Also, the AArch64
crypto
feature is white-listed and it stands for sha1
, sha2
, aes
, and pmull
: https://github.com/rust-lang-nursery/stdsimd/blob/master/stdsimd/arch/detect/arch/aarch64.rs#L91 , so if that's all you need you can go directly to step 3 for aarch64
as well. AArch64's also has sha3
and sha512
features but those aren't white-listed yet (also I don't know whether LLVM supports these yet; it should though).
I'll look at adding aarch64
if I can get hold of a test machine. Will probably be a separate PR.
An alternative would be to get hold of qemu-aarch64 or use our CI via travis to implement these.
I got hold of an ARMv8 server (with 96 cores!) but I'm running into some strange errors:
error:
is_x86_feature_detected can only be used on x86 and x86_64 targets.
You can prevent it from being used in other architectures by
guarding it behind a cfg(target_arch) as follows:
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))] {
if is_x86_feature_detected(...) { ... }
}
--> crates/coresimd/src/../../../coresimd/aarch64/neon.rs:725:5
|
725 | #[simd_test = "neon"]
| ^^^^^^^^^^^^^^^^^^^^^
|
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
Any idea what's going on here? Tested on current master (56937d25fb6e4f3fc14c9f0e29123050cac96aca) with rustc 1.26.0-nightly (75af15ee6 2018-03-20)
.
Forgot to say: there are 67 of these errors generated when running cargo test
. This is on Linux (Ubuntu 16.04 LTS).
The system is a dual-socket Cavium ThunderX SoC.
@jasondavies if you set a TARGET
environment variable to the target you're compiling for, does that fix the issue?
Yes, it does!
Ah ok, I think the bug is here --
I think Cargo doesn't actually set that for crate compilations, so we should fix that!
Ok I think I've fixed that in https://github.com/rust-lang-nursery/stdsimd/pull/397 and otherwise this should be fixed, so closing.
Ah, I see. So my opinion on this and #397 is that somebody checking out stdsimd
and running ci/run.sh
or maybe even just cargo test
should get a "nice" out-of-the-box experience. At least with respect to being able to run the test on the platform they are developing on. I think this is important to get good bug reports.
So the code that #397 fixes actually tried to come up with sensible defaults for linux, macosx, and windows, but it might well be that reasonable defaults do not really exist. If that's the case erroring early sounds like a good plan to me.
What needs to be done to support Intel's SHA extensions?
Happy to help if someone is willing to mentor/guide me!
I would also note that there are equivalent/similar extensions for ARM and POWER8. But getting support for the Intel extensions would be a great start; they are already supported on AMD Ryzen.