Tracking issue for RFC 2045: improving `#[target_feature]`

alexcrichton commented 6 years ago

This is a tracking issue for the #[target_feature] segment of RFC 2045 (https://github.com/rust-lang/rfcs/pull/2045). #[cfg(target_feature)] was tracked in #29717 and has since been stabilized.

Steps

[x] Implement the proposed #[target_feature] semantics
[x] Document these semantics: https://github.com/rust-lang/reference/pull/545
[ ] Stabilize
- x86_64 & i686
  - [x] The basic set (sse–sse4.2, avx, avx2, ...) (#49664)
  - [x] adx: https://github.com/rust-lang/rust/pull/93745
  - [ ] avx512
  - [x] ~~mmx~~ (never going to be stabilized)
  - [ ] sse4a
  - [ ] tbm
  - [ ] lahfsahf
  - [ ] prfchw
- [ ] arm
- [ ] aarch64
- [ ] hexagon
- [ ] powerpc
- [ ] mips

@gnzlbg have anything else you want filled out here?

(below added from comments on PR)

[ ] consensus on the API for run-time feature detection
[ ] should cfg!(feature) work across #[inline(always)] functions, generics, etc?

And some related tasks:

[x] API breaking change: allow #[target_feature] on unsafe functions only
[x] API breaking change: #[target_feature = "+feature"] => #[target_feature(enable = "feature")]
[ ] fix bug: https://github.com/rust-lang/rust/issues/42515
[x] resolve bug: https://github.com/rust-lang/rust/issues/44367

gnzlbg commented 6 years ago

@alexchrichton

Yes, we should also clear some of the unresolved questions:

[ ] consensus on the API for run-time feature detection
[ ] should cfg!(feature) work across #[inline(always)] functions, generics, etc?

And some related tasks:

[ ] API breaking change: allow #[target_feature] on unsafe functions only
[ ] API breaking change: #[target_feature = "+feature"] => #[target_feature(enable = "feature")]
[ ] fix bug: https://github.com/rust-lang/rust/issues/42515
[ ] resolve bug: https://github.com/rust-lang/rust/issues/44367

It would be nice if those API breaking changes could be prioritized.

retep998 commented 6 years ago

Rust already will sometimes not inline a #[inline(always)] function (notably recursive situations), so unsafe code already can't assume that #[inline(always)] code will always be inlined.

gnzlbg commented 6 years ago

@retep998 cfg!(feature) does not require any unsafe code.

retep998 commented 6 years ago

@gnzlbg I meant in regards to where the RFC states that #[target_feature] and #[inline(always)] mixed together should result in an error. We can simply not inline functions in that case, even if they are marked as #[inline(always)], because Rust will already sometimes not inline functions marked #[inline(always)].

gnzlbg commented 6 years ago

@retep998 We could definitely relax this by just not inlining an #[inline(always)] function. I don't know if we want to do so: the user did not write inline, but inline(always), we should at least warn here.

What we cannot currently do is inlining a function across mismatching features (independently of its inlining annotations). The alternative sections of the RFC explores this a bit though.

newpavlov commented 6 years ago

Any news on implementation status of --enable-features="feature0,feature1,..." proposed in the RFC?

gnzlbg commented 6 years ago

@newpavlov this is being discussed in ~~issue 49658~~ . (note from pnkfelix: did this mean to say #49653 ?)

alexcrichton commented 6 years ago

https://github.com/rust-lang/rust/issues/48556 has entered FCP which I believe will implicitly stabilize this attribute

gnzlbg commented 6 years ago

So after the inline-always fix it looks like the only bug open still being tracked here is https://github.com/rust-lang/rust/issues/42515 .

gnzlbg commented 6 years ago

That bug looks hard to solve and probably won't make it before stabilization.

The following issues look "easy" to fix low-hanging fruit. It might be worth to fix these before stabilization:

kennytm commented 6 years ago

I'm reopening this as tracking issue for the non-x86 #[target_feature]s.

jethrogb commented 5 years ago

Since the word "feature" is quite overloaded I'll try to use the words "Rust feature" and "target feature" to disambiguate.

I think there are some big issues with the x86 Rust feature gating as implemented right now. The following unstable Rust features were added after the main x86 target feature stuff was stabilized:

avx512_target_feature
mmx_target_feature
sse4a_target_feature
tbm_target_feature
adx_target_feature
cmpxchg16b_target_feature
movbe_target_feature

However, the Rust feature gate only affects the target_feature(enable) and cfg(target_feature) attributes. It does not cover the matchers of the is_x86_feature_detected macro, and it does not (always) cover the corresponding intrinsics either. For example, this compiles on stable:

use std::arch::x86_64::*;

fn main() {
    if (is_x86_feature_detected!("sse4a")) {
        println!("{:?}", unsafe { _mm_extract_si64(_mm_setzero_si128(), _mm_setzero_si128()) });
    }
}

Here's what RFC 2325 says about this:

It's not expected that the contents of std::arch will remain static for all time. ... Rather once [new intrinsics are] implemented they'll be stabilized and included in libstd following the Rust release model.

It doesn't say so explicitly, but I assume the stdsimd Rust features used to "follow the Rust release model" should match the Rust features for the target features used. Or at the very least, these Rust features should be stabilized at the same time.

... is_target_feature_detected!. The macro will accept one argument, a string literal. The string can be any feature passed to #[target_feature(enable = ...)] for the platform you're compiling for.

It's not stated explicitly, but I assume this is meant to only enable those strings that you can pass to target_feature(enable) with your current compiler.

How can we clean up the current mess?

gnzlbg commented 5 years ago

It does not cover the matchers of the is_x86_feature_detected macro,

I think we could fix this in the is_x86_feature_detected macro. I don't think we can just cfg out the match arms but we could simply cfg in some compile_error! if these features are not enabled when the match arms are used. There might be a better way though.

and it does not (always) cover the corresponding intrinsics either.

That's because those intrinsics are stabilized: https://github.com/rust-lang-nursery/stdsimd/blob/master/crates/core_arch/src/x86/sse4a.rs#L56

I don't know if this stabilization was accidental or not (It looks to me to be accidental since those feature gates were created to not stabilize these features).

How can we clean up the current mess?

For the two points mentioned above, PRs to stdsimd should suffice.

For the issue about #[target_feature(enable = "...")] accepting unstable features, we will need some changes to rust-lang/rust.

jethrogb commented 5 years ago

For the two points mentioned above, PRs to stdsimd should suffice.

I didn't do a full audit of all target features/intrinsics that should or shouldn't be Rust feature gated.

gnzlbg commented 5 years ago

Basically, all the features here https://github.com/rust-lang/rust/blob/f40aaa68278ef0879af5fe7ce077c64c6515ea05/src/libsyntax/feature_gate.rs#L384 should be unstable. That is, using these in #[target_feature(enable = "...")], #[cfg(target_feature = "...")], and using the core::arch:: intrinsics that correspond to them should be unstable and require enabling the appropriate feature.

Stabilizing any of these would require at least a mini-FCP.

jethrogb commented 5 years ago

That's the easy part of the audit. We know that part works right.

gnzlbg commented 5 years ago

That's the easy part of the audit. We know that part works right.

I know it worked (so did everything back then AFAICT), but I don't know if everything is still working right now (as you point out, some things aren't working).

I am going to add compile fail tests to stdsimd to make sure that things don't break again.

jethrogb commented 5 years ago

This feature seems too inconvenient to use right now. Having to mark everything unsafe is not great, but marking individual functions is also not very useful when writing generic code. Consider the following code, in which I have a larger algorithm that ties together two computational kernels:

trait Kernel1 {
    ...
}

trait Kernel2 {
    ...
}

fn compute<K1: Kernel1, K2: Kernel2>(...) {
    ...
}

I have several hand-optimized versions of either kernel using SIMD intrinsics for various feature sets. I have a wrapper function for compute that uses feature detection to figure out which implementation of the algorithm to call. This doesn't work very well however. I'm seeing a 35% reduction in performance when doing this, compared to a compiling a particular implementation setting the features via RUSTFLAGS. When monomorphizing compute, what I want is that the compiler computes the union of the features needed for K1 and K2 and uses that in the codegen for compute and all functions called from compute.

gnzlbg commented 5 years ago

There are several pull requests To the RFCs repository that improve this. You might to want to chime in on those.

Some of them do help with some of the problems that you are having (eg the target feature 1.1 RFC). Adding support for trait methods could be a future extension over that, but the reason trait methods in our case is slow is because LLVM is missing basic target feature propagation optimizations. There is an LLVM bug open over that. I suppose that writing our own custom LLVM pass that does this shouldn’t be too hard.

jethrogb commented 5 years ago

Thanks @gnzlbg! Do you have a link to those RFCs? Also in my case I'm only using associated functions, not methods.

gnzlbg commented 5 years ago

@jethrogb there is a PR open that fixes the accidental stabilization: #64534 .

gnzlbg commented 4 years ago

@jonas-schievink this should probably be marked with A-stability as well, since an accidental stabilization was reported here.

joshuawarner32 commented 4 years ago

With the impending release of Apple Silicon macs, it could be useful to start stabilizing the basic aarch64 target_feature support (in particular, I care about neon). This'll help crates provide high-quality simd implementations on this platform. Anything I can do to help in this effort?

workingjubilee commented 3 years ago

@joshuawarner32 You can help by adding more NEON intrinsics to https://github.com/rust-lang/stdarch if you want extensive aarch64 NEON support. It does not cover that many intrinsics, and we'd want more testing of more NEON intrinsics.

bradjc commented 3 years ago

Are there more boxes that should be checked for this tracking issue? For example, I'm able to use the arm target features on nightly without a feature.

ehuss commented 3 years ago

@bradjc Which ARM features are you able to use? They should all be unstable.

mattico commented 3 years ago

I was building with --target=thumbv7em-none-eabihf -Ctarget-feature=+armv7e-m,+m7,+v7clrex,+fp-armv8d16,+hwdiv,+fp64,+fpregs,+fpregs64,+noarm,+db,+thumb2 and then checking literally every arm target-feature using cfg!() and #[cfg()] without enabling arm_target_feature.

ehuss commented 3 years ago

@mattico This issue is about stabilizing the #[target_feature] attribute. The -C flag has been stable since 1.0.

mattico commented 3 years ago

@ehuss Ah, yes. Was confused because the same whitelist is used for both purposes.

bradjc commented 3 years ago

We are using at least v7 and thumb-mode.

https://github.com/tock/tock/blob/9077486ea575fa234ebc675a256958a7693413a1/arch/cortex-m/src/lib.rs#L214-L227

workingjubilee commented 3 years ago

They have similar nomenclature and use cases, but #[cfg(target_feature)] is not #[target_feature].

hudson-ayers commented 3 years ago

Perhaps the issue description should be updated to note that this only tracks a portion of https://github.com/rust-lang/rfcs/pull/2045 ? Because that RFC seems to specify cfg_target_feature in addition to target_feature, and this issue description does not make it clear that cfg_target_feature was tracked elsewhere and has since been stabilized (https://github.com/rust-lang/rust/issues/29717).

From looking at the checked boxes in this issue description, I assumed that nothing in RFC 2045 had been stabilized for non-x86 architectures.

dylanede commented 3 years ago

Am I right in saying that cfg!(target_feature = "...") doesn't currently match the description from the RFC? I.e. it doesn't depend on the current context, only on the build arguments. This means that it cannot be used to selectively enable code in a function based on the context that the function is inlined into.

According to the RFC, this code should panic, but currently it does not:

playground link

fn main() {
    if is_x86_feature_detected!("avx2")
    {
        println!("Has AVX2");
        unsafe { foo() };
        println!("Didn't panic!");
    }
}

#[target_feature(enable = "avx2")]
unsafe fn foo() {
    if cfg!(target_feature = "avx2") // this should always be true
    {
        panic!();
    }
}

ehuss commented 3 years ago

@dylanede Correct, cfg_target_feature is not context-aware as described in the RFC. I believe #42515 is tracking that, and it was intentionally stabilized without that being implemented (though there wasn't mention of the cfg! macro specifically).

tarcieri commented 2 years ago

FYI, opened a PR where I tried to pick up @gnzlbg's work from #60109 stabilizing the ADX target feature:

https://github.com/rust-lang/rust/pull/93745

pnkfelix commented 2 years ago

The lang-team looked at this issue during its backlog bonanza on 2022-06-09

There are a couple of problems here.

First, from my own attempts to quickly review the status here, It is not immediately clear which items that are listed on this issue's description represent cross-cutting concerns for all (or nearly all) uses of #[cfg(target_feature="...")] and/or #[target_feature="...")], and which are "just" to-do items of potential CPU features to support.

One thing that did become clear during my review is that the checklist in the description is not really being maintained. (E.g. an early checkbox item, "add documentation", was left unchecked; I determined that documentation had been added and checked it off myself. As another example: there were suggested checkbox items that seemed like they should have been transcribed to the description, but instead were left on the comment thread.)

A big checklist that is not been maintained (i.e. the open items may or may not be actually done) is not great.

Furthermore: The T-lang team thinks its is not great to block resolving this issue on every potential CPU feature that we could support.

At the time of that backlog bonanza meeting:

the T-lang understanding was that the main question remaining here was whether the CLI flag --enable-feature, documented in RFC 2045, was still desireable. (Note the RFC discusses how this flag differs from the already present (and defacto stable) -C target-feature=... CLI flag.
the thinking was that we could/should fcp close this issue, noting that any new features slated for stabilization should be in new issues.

However, since reviewing the comment thread on this issue, I'm not convinced that its ready for an fcp close. Or at least, I was not able to determine the current status on a number of (unlinked, potentially unfiled?) issues that are listed for it, and it would take a bit more effort for me to build experimental code to establish the status on those issues.

Perhaps most worrisome here: I am not sure that the owners who were most invested in driving this feature forward are currently in a state where they are going to continue doing so.

In other words: I think this issue is in need of an owner.

(I would be happy to be proven wrong here. If it is true that there is no longer interest in making a --target-feature flag, at least in the short term, and that there are no longer cross-cutting issues, then perhaps the only remaining work is truly the checklist of target-specific features. if that is true, then I would suggest that we fcp close this issue, and open up specific issues for each such target-specific feature. Its too much work to ask a single owner to drive progress on each such thing.)

luojia65 commented 2 years ago

Hello! Please add stabilization state of RISC-V architecture in issue details as well :) It only has x86_64, arm etc.

joshtriplett commented 2 years ago

We discussed this in a @rust-lang/lang backlog review today.

We feel like this tracking issue tracks two different things:

Target-specific features, of which there's a never-ending stream as CPUs add new features. This isn't something we should have an overall tracking issue for, just tracking issues for adding and stabilizing new features.
Enhancing target_feature itself (attributes, command-line options, etc) to make it easier to use, safer to use, usable in more places, etc. This is something we'd love to see, but it needs an owner.

nagisa commented 2 years ago

API breaking change: allow #[target_feature] on unsafe functions only API breaking change: #[target_feature = "+feature"] => #[target_feature(enable = "feature")]

Both of these are true today, except for the first on WASM platforms, where #[target_feature] was made applicable to safe functions knowingly.

bkolligs commented 1 year ago

Any update on AVX 512 support?

qwandor commented 1 year ago

Hi, is there anything blocking stabilising arm_target_feature?

coastalwhite commented 1 year ago

Hey! I am currently implementing RISC-V Zk intrinsics, and I am running into a pattern that is currently not well covered by this PR.

Either Zkne and Zknd extension needs to be enabled to get access to an intrinsic. I imagine for the compiler it is enough to write:

#[target_feature(enable = "zkne")]
pub unsafe fn aes64ks1i(...) -> ... {
     ...
}

But this produces wrong cargo doc output and can be misleading. Namely, it states that zkne needs to be enabled.

#[target_feature(enable = "zkne", enable = "zknd")]
// OR
#[target_feature(enable = "zkne")]
#[target_feature(enable = "zknd")]
pub unsafe fn aes64ks1i(...) -> ... {
     ...
}

Produces zkne AND zknd.

#[target_feature(enable = "zkne,zknd")]
pub unsafe fn aes64ks1i(...) -> ... {
     ...
}

Produces zkne,zknd which does not exist.

I feel like this is only a documentation issue, but still an issue.

majaha commented 1 year ago

I'm not sure if this is the right place for this, but the wasm feature #[target_feature(enable = "multivalue")] is broken at the moment, as it leaks into other untagged functions: Godbolt This is essentially for the same reasons I've outline here: https://github.com/rust-lang/rust/issues/83788#issuecomment-1685582270 There may be other target features with similar non-local behaviour, I haven't tested them all.

jhorstmann commented 10 months ago

I tried reading through the open issues regarding target features, but could not find a more fitting one for this question. I have the following code, which currently returns different values on stable vs nightly when compiled with -Ctarget-cpu=skx (Godbolt). I would have expected that target-cpu implicitly enables that feature, or a warning/error that the avx512 target features are still unstable.

pub fn preferred_vec_size() -> usize {
    if cfg!(all(target_arch = "x86_64", target_feature = "avx512f")) {
        64
    } else if cfg!(all(target_arch = "x86_64", target_feature="avx")) {
        32
    } else {
        16
    }
}

workingjubilee commented 10 months ago

That's... huh. I guess the first branch is being ignored as an invalid feature gate on stable.

tarcieri commented 8 months ago

Are there any particular stabilization blockers for feature(avx512_target_feature), or does it just need a stabilization PR?

Edit: response here

rust-lang / rust

Tracking issue for RFC 2045: improving `#[target_feature]` #44839