Closed adamgemmell closed 2 years ago
One possible issue I've noticed is that I can't see a way to view https://doc.rust-lang.org/std/macro.is_aarch64_feature_detected.html under a different platform like you can with other crates on docs.rs. Until we can I'm not sure it's possible to make the documentation for the macro easily visible.
The documentation on doc.rust-lang.org is for all platforms at the same time.
The issue is there's different output for multiple different platforms. I'll try removing the doc comment on the error macro in there to see if that helps. It's not useful on the docs page anyway and should just be a normal comment
Is the general ARM one different enough to not just stabilize it at the same time as well?
@rfcbot merge
@rfcbot merge
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
pauth
or not to split pauth
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Have the unresolved questions here been addressed?
Also, are we stabilizing any corresponding #[target_feature(enable = "...")]
values for aarch64?
Is the general ARM one different enough to not just stabilize it at the same time as well?
One issue is that we can't remove the crypto
feature for ARM until this commit makes it to Rust's minimum LLVM version.
I don't see a problem with stabilising the other ARM features though (this seems to be an option based on how the macro is written). Just that no-one has asked for it yet.
I'm not sure how well it works on FreeBSD, nor if there's any more ARMv8 features that affect aarch32 execution (though neither of these arguably block stabilisation).
Have the unresolved questions here been addressed?
Amanieu seems happy with simd_aarch64
and I can't see any issues with it.
Personally I'm happy keeping PAuth as one feature as LLVM does as it would then match target_features.
Also, are we stabilizing any corresponding
#[target_feature(enable = "...")]
values for aarch64?
These should be mostly up to date. https://github.com/rust-lang/rust/blob/c7a30c8b6860d1f3459086f7a91074db1b54bc37/compiler/rustc_codegen_ssa/src/target_features.rs#L40 Since the minimum LLVM version has moved up quickly I reckon a few could be uncommented - such as PAuth
To clarify, we are only stabilizing feature detection, not the stdarch intrinsics or #[target_feature(enable = "..")]
. Although now that you mention it, we should probably stabilize aarch64_target_feature
as well for #[target_feature(enable = "..")]
since it uses the same feature names.
@rfcbot concern move to a module?
Should we move the macro out of the root of std
? Now it's just std::is_aarch64_feature_detected
. Since nowadays we have the ability to put macros into modules, it might be good to not clutter the root of the crate with very specific macros like this.
The (stable) x86/x86_64 variant is in the crate root too. Moving only the aarch64 variant would be confusing.
After some internal discussion my opinion on splitting PAuth
has changed a little - apparently there have been attempts to enable partial userspace support of the feature in the past and it's reasonably possible that it could happen in the future, in which case we would want separate paca
and pacg
options. These could of course be added in the future, though we'll end up with those in addition to pauth
, slightly confusingly.
Would it make sense to split them preemptively for runtime feature detection? Or would the above scenario be ok?
I would rather preemptively split the features. We also need to figure out how the two features map to LLVM's pointer authentication support.
PAuth demonstrates some rather interesting properties that could apply to other features too, so it's worth examining, and perhaps using this as a model for other cases that arise. Notably:
A likely example application could involve the compiler using paca
to implement CFI, whilst user code explicitly uses pacg
to do something else.
So, as a general policy, at least for runtime detection, I would argue for splitting features down to the point that they're unlikely to be split further in the future, as we've done here. A useful guide for is "unlikely to be split further" is the hardware ID registers, and those are what we used for VIXL. Most Linux CPU features map 1:1 onto minimum values of fields in those registers.
The compile-time detection — not strictly this issue but related — is made complicated by interactions with LLVM. As a user, I would prefer those to be split in the same way, but I don't fully understand the interactions with LLVM, so it may not be possible. I know that Rust has a mapping function so we can deal with renamings, but dealing with combinations may be trickier.
I agree that we should be consistent in locating this; I don't think it does any harm in the root, since it has a sufficiently specific name that it won't conflict with anything.
I checked LLVM's code: the pauth
feature only controls whether the assembly instructions are available (and whether LLVM's codegen is allowed to use emit). CFI is controlled by a different flag.
I agree that we should try to have fine-grained features where possible. The main issue with exposing finer-grained features than LLVM is that there may not be a precise way to map #[target_feature(enable = "paca")]
to LLVM. Should we just enable the pauth
feature entirely? Should we give an error and require the user to enable both paca
and pacg
for the LLVM pauth
feature to be enabled?
For any given situation where a codegen backend offers what we consider "two" features as one feature, either enabling both is what one might consider a reasonable interpretation or it is not.
An example is if we decide to, say, separate the architectural state added by certain ISA features from the instructions that are enabled by those, such as with #[target_feature(enable = "+sse4.2, -xmm")]
. This hypothetical set of flags would enable the CRC32 instruction to be used, and LLVM recently bifurcated its features to support such a choice. It would not be reasonable for a codegen backend, to which we are attempting to deliver the request "do not use this architectural state", to emit an SSE4.2 instruction that did require the xmm registers, as that breaks the requested contract.
This is slightly strained, obviously, as just allowing #[target_feature(enable = "+crc32")]
might be simpler for that case.
However, if in practice two features are sometimes known under one name because they usually appear together, it gets a fair bit murkier. I think, for instance, it's fine for the above "+sse4.2" to also imply "+xmm" or "+crc32"... not to mention that not doing so would probably be backwards incompatible, yanno... again, it's a slightly strained example.
That's my thoughts anyways. I think if there's any uncertainty we should error, as generally it's permissible for a compiler to increase the amount of sound code it emits, and thus later choose to allow compilations that we would have previously erred on, but reducing it tends to receive more complaints.
I think it's also fair to simply pose the question to LLVM of how they feel about the matter, and if they have any substantial policy here or are just kinda winging it.
Also, are we stabilizing any corresponding
#[target_feature(enable = "...")]
values for aarch64?These should be mostly up to date.
https://github.com/rust-lang/rust/blob/c7a30c8b6860d1f3459086f7a91074db1b54bc37/compiler/rustc_codegen_ssa/src/target_features.rs#L40 Since the minimum LLVM version has moved up quickly I reckon a few could be uncommented - such as PAuth
What is the user-visible distinction between the "fp" and "neon" features for aarch64, as defined by this list?
The (stable) x86/x86_64 variant is in the crate root too. Moving only the aarch64 variant would be confusing.
@m-ou-se does this comment resolve your concern?
I'm going to mark the current discussion as a concern since this doesn't seem resolved currently but otherwise approving the FCP
@rfcbot reviewed
@rfcbot concern to split pauth
or not to split pauth
As for the other unresolved concern, I agree with @bjorn3 and @joshtriplett here that we should prioritize the local consistency of where we expose similar functions over cleanliness of the root namespace.
Regarding pauth
, there is a pull request to split them for feature detection in stdarch (https://github.com/rust-lang/stdarch/pull/1259) but this is still missing support in rustc for cfg(target_feature)
and #[target_feature(enable)]
.
I agree that splitting pauth and allowing just one suboption doesn't make sense as LLVM is. The RFC says that target_feature
will "allow the compiler to generate code under the assumption that this code will only be reached in hosts that support the feature" - we can't silently enable a feature without the user saying it exists.
Should we give an error and require the user to enable both
paca
andpacg
for the LLVMpauth
feature to be enabled?
This looks to be the best option - it's also the least LLVM-specific one, should GCC/Cranelift add support for this in the future. I've been looking into where (within rustc_codegen_llvm ideally) to throw this error from.
What is the user-visible distinction between the "fp" and "neon" features for aarch64, as defined by this list?
Neon implicitly enables FP in LLVM, and (probably among other things) allows use of the Neon SIMD intrinsics recently added to core::arch::arch64
. It's necessary because it's not guaranteed that NEON is available as a user may have disabled it with a custom target.
Right. The FPU for aarch64 is the Neon unit, however, so allow me to rephrase:
Is there any case where it is actually valid to have "-neon, +fp"
or "+neon, -fp"
for AArch64, or is that always erroneous?
fp
and neon
are distinct subsets of the ISA, so it is possible in theory. But in practice, you'll never find an AArch64 chip without both available.
I agree that splitting pauth and allowing just one suboption doesn't make sense as LLVM is. The RFC says that
target_feature
will "allow the compiler to generate code under the assumption that this code will only be reached in hosts that support the feature" - we can't silently enable a feature without the user saying it exists.Should we give an error and require the user to enable both
paca
andpacg
for the LLVMpauth
feature to be enabled?This looks to be the best option - it's also the least LLVM-specific one, should GCC/Cranelift add support for this in the future. I've been looking into where (within rustc_codegen_llvm ideally) to throw this error from.
This likely requires adjusting the canonicalizing logic in to_llvm_feature
to allow both "one to many" and "many to one" relationships:
https://github.com/rust-lang/rust/blob/26a2564666cfdd1c18c98b8312ca1bc47c995073/compiler/rustc_codegen_llvm/src/llvm_util.rs#L155
Which may require adjusting its callsites (mostly within the same file): https://github.com/rust-lang/rust/blob/26a2564666cfdd1c18c98b8312ca1bc47c995073/compiler/rustc_codegen_llvm/src/llvm_util.rs#L194 https://github.com/rust-lang/rust/blob/26a2564666cfdd1c18c98b8312ca1bc47c995073/compiler/rustc_codegen_llvm/src/llvm_util.rs#L253 https://github.com/rust-lang/rust/blob/26a2564666cfdd1c18c98b8312ca1bc47c995073/compiler/rustc_codegen_llvm/src/llvm_util.rs#L396
This is the one that covers #[target_feature(enable)]
usage, I believe?
https://github.com/rust-lang/rust/blob/7c4be43b27993ab405beaa19738258fdd546d3db/compiler/rustc_codegen_llvm/src/attributes.rs#L326
With that adjustment, it may be simplest to just throw the error from within to_llvm_feature
itself.
fp
andneon
are distinct subsets of the ISA, so it is possible in theory. But in practice, you'll never find an AArch64 chip without both available.
Indeed, in practice, most general-purpose software, including the Rust's aarch64-unknown-linux-gnu
, assumes that both FP and NEON are present, and it's required by AAPCS64, which doesn't have any soft-float provision.
The Armv8 architecture does, theoretically, permit them to be unsupported, but it doesn't permit one to be present without the other. There are separate ID register fields for each, but they are required to have the same value. (It also follows that if one supports FP16, so must the other.) In addition, it's pretty hard to determine what it would actually mean to have one but not the other; whilst the reference manual identifies "A64 Advanced SIMD and floating-point instructions", it doesn't attempt to classify them further than that, and there are plenty of instructions that could reasonably belong to either.
Is there any case where it is actually valid to have
"-neon, +fp"
or"+neon, -fp"
for AArch64, or is that always erroneous?
In general: possibly, yes. A user may want to disable just neon
, for example as a debugging tool. This can occur at run-time too, if masked by the OS or tools like Valgrind (or if dynamic feature detection failed for some reason).
In Rust: I expect that it will vary from target to target, but building with either "-neon"
or "-fp
" already deviates from most AArch64 targets in a undefined-behaviour kind of way, doesn't it? That said, that doesn't mean that we shouldn't detect and report them in this API; naively, it looks like it could permit compatibility with bare-metal targets like aarch64-unknown-none-softfloat
.
I'd like to propose a slightly different approach here to unblock the issue:
Could we stabilize is_aarch64_feature_detected
, but not stabilize the pauth
feature yet, and let the discussion about splitting the feature only block stabilization of pauth
rather than all of is_aarch64_feature_detected
?
Could we stabilize is_aarch64_feature_detected, but not stabilize the pauth feature yet, and let the discussion about splitting the feature only block stabilization of pauth rather than all of is_aarch64_feature_detected?
I support this solution to unblock the FCP.
@yaahc @m-ou-se Are your concerns are resolved? Could we unblock the FCP?
The (stable) x86/x86_64 variant is in the crate root too. Moving only the aarch64 variant would be confusing.
What I'd prefer is to move both into a submodule, just like asm!()
, and re-export the x86 one through the prelude/root for backwards compatibility but acknowledge that was a mistake.
All of std's macros were originally in the root, because of #[macro_export]
. But nowadays with the new pub macro
syntax, we have the option of putting them in modules, which we've been using for std::ptr::addr_of
, std::arch::asm
, etc. I don't think we should continue putting macros in the root/prelude by default, even if we have to keep a few older ones there because of historical reasons.
I'd support that if and only if we can actually deprecate the re-export. I think we could deprecate the re-export by writing a deprecated wrapper macro that calls the version in std::arch
.
Otherwise, if we can't deprecate it, I think we should re-export both at the top level.
That said, I also think these names will never conflict with anything, and exporting them will not do any harm, so while I don't object to unblocking the issue by taking that approach, I also would be fine with just exporting these at the top level and have no objection to that approach either.
@yaahc @m-ou-se Are your concerns are resolved? Could we unblock the FCP?
for sure
@rfcbot resolve to split pauth
or not to split pauth
I think we could deprecate the re-export by writing a deprecated wrapper macro that calls the version in
std::arch
.
Yeah, good idea. That should work.
That said, I also think these names will never conflict with anything, and exporting them will not do any harm
I'm not worried about naming conflicts. I'm concerned about adding more clutter to the front page of the standard library reference documentation (https://doc.rust-lang.org/stable/std/) and to auto-completion features in editors.
That said, I also think these names will never conflict with anything, and exporting them will not do any harm
I'm not worried about naming conflicts. I'm concerned about adding more clutter to the front page of the standard library reference documentation (https://doc.rust-lang.org/stable/std/) and to auto-completion features in editors.
I agree with this. It's unfortunate that one macro in this series was stabilized long before macro namespacing was possible, but it's possible to avoid exacerbating this annoyance, as well as avoid setting a precedent for future macros in this series. As mentioned, it's possible to deprecate is_x86_feature_detected
by defining a wrapper macro, and deprecating this would give IDE tooling authors cover to avoid autocompleting it; consider today that autocompleting i
or std::i
would suggest is_x86_feature_detected
before suggesting Iterator
or std::iter
. And considering that the arch
module already exists, it's not difficult to figure out where these should be naturally defined instead.
@rfcbot resolve move to a module?
This is resolved by #93414.
:bell: This is now entering its final comment period, as per the review above. :bell:
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
This will be merged soon.
Sorry for the silence on this issue for a while. The implementations should be mostly up to date now. I'll summarise my open PRs for this in one place:
Runtime detection macro:
Target feature:
aarch64_target_feature
. Needs a small update depending on what we do with this:See also https://github.com/rust-lang/rust/issues/90620 for the aarch64_target_feature
stabilisation issue.
Is there anything else needed for this issue, or can it be closed?
https://github.com/rust-lang/rust/pull/90271#issuecomment-1038400916 @ehuss continuing discussion here.
Looks like the error macro (in stdarch/crates/std_detect/src/detect/error_macros.rs
) is still marked as unstable under stdsimd
.
The same is also true for the x86 version - so you get an unstable warning when using is_x86_feature_detected!
on aarch64 platforms, for instance.
Both should clearly be fixed I think - I'll have a patch up shortly.
@ehuss also mentioned that the docs still show the error_macro docs for none-x86 platforms. Yet to find the reason but everything's fine with cargo doc
in stdarch, but not with ./x.py doc
.
Maybe some issue due to std_detect being a separate crate from std which causes #[cfg(doc)]
to not work correctly?
When a dependency is built for rustdoc, cfg(doc)
is not set. So the re-export is just re-exporting the normal macros, not the cfg(doc)
ones.
That makes sense https://github.com/rust-lang/cargo/issues/8811
So, with beta cutoff next week, we could:
[package.metadata.docs.rs]
in std_detect's Cargo.toml. Does that docs.rs section also affect doc.rust-lang? Is there another (presumably unused as I couldn't find any) section for doc.rust-lang.org?aarch64.rs
to the error macro. It will look the same with the exception of the src
link, and the list of cases in the macro (which are duplicates of what's in the doc comments anyway.#[doc(hidden)]
) regardless of architecture so we can put some #[cfg(doc)]
logic in library/std/src/lib.rs
? Potentially through a wrapper macro or reexporting under a different name/module.The third option seems to be more invasive so I'm going to go with the second for now until someone says otherwise.
cutoff next week
Cutoff is Friday morning this week.
Does that docs.rs section also affect doc.rust-lang?
It does not.
Adding comments to the error macros might help a little, but I imagine will still show up as unstable so it might still be confusing.
Unfortunately I don't have much time to help here. It's a bit of a thorny issue that can be tricky to fix.
I opened https://github.com/rust-lang/stdarch/pull/1283 which should fix this.
Following up on this: what are the remaining blockers to merge this?
I believe all of the concerns have been addressed so this is just waiting on a stabilization PR.
This issue constitutes the stabilisation report as per the rustc dev guide
Summary
I'm hoping to stabilise std::is_aarch64_feature_detected which allows for runtime feature detection on aarch64 platforms. It was recently updated to include all features supported by both linux and LLVM 12, though the detection for some features isn't on non-linux platforms yet.
Documentation
The documentation for it is in this file. There's doc comments for individual features in the macro but I don't see these exposed anywhere, just on a hidden enum. Perhaps this could be improved. I've also got a small patch to mention it in the Rust Reference here.
Tests
There are tests for it in stdarch as well as rust. These are currently all up to date.
Unresolved questions
The only backwards-compatible question left to resolve is whether to split
pauth
intopaca
(address authentication) andpacg
(generic authentication) as linux does. The current behaviour (together) matches LLVM (and the ARMv8-A feature FEAT_PAuth).The x86 macro was stabilised under the
simd_x86
feature - does this mean I should create a newsimd_aarch64
feature to stabilise this and other aarch64 stdsimd work under?See also: https://github.com/rust-lang/rust/issues/48556 for the tracking issue for the stdsimd feature. https://github.com/rust-lang/rust/issues/90620 for stabilising the
target_feature
attribute on aarch64 platforms.