ARM-software / acle

Arm C Language Extensions (ACLE)
Other
90 stars 59 forks source link

[FMV] Remove features without compiler support and consolidate others. #329

Closed labrinea closed 2 months ago

labrinea commented 4 months ago

It was raised in #315 that some features are of no use. In this revision I am removing FMV features that have no equivalent backend feature in the compiler therefore cannot be supported in any meaningful way [1], and I am fusing FMV features that the compiler cannot support independently [2].

[1]: dgh, rpres, sve-bf16, sve-ebf16, sve-i8mm, memtag3 [2]: {sha1, sha2}, {aes, pmull}, {sve2-aes, sve2-pmull128}, {memtag, memtag2}, {ssbs, ssbs2}, {ls64, ls64_v, ls64_accdata}


name: Pull request about: Technical issues, document format problems, bugs in scripts or feature proposal.


Thank you for submitting a pull request!

If this PR is about a bugfix:

Please use the bugfix label and make sure to go through the checklist below.

If this PR is about a proposal:

We are looking forward to evaluate your proposal, and if possible to make it part of the Arm C Language Extension (ACLE) specifications.

We would like to encourage you reading through the contribution guidelines, in particular the section on submitting a proposal.

Please use the proposal label.

As for any pull request, please make sure to go through the below checklist.

Checklist: (mark with X those which apply)

labrinea commented 4 months ago

Please review @DanielKristofKiss @rsandifo-arm

andrewcarlotti commented 4 months ago

Thanks for picking this up. It looks mostly good to me, but is there any reason you've retained dit and ebf16 in this PR?

labrinea commented 4 months ago

Thanks for picking this up. It looks mostly good to me, but is there any reason you've retained dit and ebf16 in this PR?

Good catch on ebf16. I simply missed it. I'll fix in new revision. Regarding dit, LLVM seems to have an equivalent backend feature therefore I think it's worth keeping in the short/medium term. We can revisit later if it is of no use. Do you agree?

andrewcarlotti commented 4 months ago

There's no -march flag for +dit, and my previous objections to it still apply, so I still think it should be removed. If you'd rather leave it for now, then we could have that as a separate incremental PR.

tmatheson-arm commented 4 months ago

Wasn't the fundamental criticism in #315 that FMV is about detection, not codegen? And therefore just because there is no codegen effect of an FMV feature doesn't mean it is unusable.

For example, combining sha1 + sha2: is there never any context in which you want to discriminate between them for detection?

I thought mentag/memtag2 was already identified as a necessary distinction?

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

I am removing FMV features that have no equivalent backend feature in the compiler

Just because the LLVM backend happens to combine features for codegen, that is not a justification for combining them at the FMV level. I would expect the rationale to be based on the Arm ARM or the feature sets of existing processors.

DanielKristofKiss commented 4 months ago

Originally all features that available in user-space were added to the FMV spec.

Especially with target_version the expectation is that developers could write code for a certain feature with intrinsics or even inline assembly with .inst so no need for backend support. Also target_version could be used for functionally control some aspect of the application instead of the codegen benefits.

Today there is code out there in the wild that use feature detection(HWCAP) to pick up a feature only when an unrelated and actually not used other feature is present as a kind a uarch version detection. This is bit of abuse of the feature detections but wanted to add here for awareness. So with target_clones maybe we need to keep considering features that might not end up in instructions.

I'd prefer to keep as many feature in the list as many observable in userspace and remove stuff if we can be sure there is no real chance to use them in some way.

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

I second this approach.

Wilco1 commented 4 months ago

I disagree - https://github.com/ARM-software/acle/pull/315 lists good reasons to merge/remove various features. I believe it is better to start with a small set of well defined features and expand over time as required. The ACLE currently lists 60 FMV features, but GCC and LLVM implement a subset of these - AFAIK which will work and which do not is not documented anywhere...

Note FMV is not the best nor the most efficient for feature detection. Defining a global variable that holds the HWCAP values may be a better solution if that is all that is needed.

labrinea commented 4 months ago

Let me explain how the landscape looks like in LLVM and why I am proposing this change. Currently the vehicle for propagating architectural features from Clang to LLVM is the 'target-features' metadata. This is common between the command line and the target attribute, allowing them to interoperate. The FMV features I am suggesting to remove do not have corresponding metadata, making the interoperability impossible. Here is a motivating example why we need it: https://github.com/llvm/llvm-project/pull/87939. A workaround would be to add dummy metadata (without backend correspondence) for some of those FMV features. The features I am mostly concerned about are those with different meanings in FMV than the command line or the target attribute. Those are: aes, sve2-aes, memtag, ssbs, ls64. Perhaps we could start by aligning these?

labrinea commented 4 months ago

I think this is more likely to progress if it is split into several smaller patches, in which the rationale for combining/deleting each set of features is made clear.

Okay here is what is required in LLVM for splitting sha1 from sha2: https://github.com/llvm/llvm-project/pull/99816. If we decide to move this forward then GCC would have to implement the same behavior. Unfortunately we cannot do the same for the features I listed in my previous message (aes, sve2-aes, memtag, ssbs, ls64) because we would have to change their semantics, which breaks backwards compatibility. Let me explain: According to ArmARM:

The compiler is using the name of the dependee to represent the aggregated feature set instead of using the name of the dependant: for example the name ls64 means {feat_ls64_accdata+feat_ls64_v+feat_ls64}. If the name ls64_accdata had been used instead, then we would be able to do what I've shown for feat_sha1. Same goes for the rest of the features I have listed. Do you think it's ok to drop backwards compatibility and spit all these features? Or do you prefer to proceed with this ACLE proposal? Neither is great but we have to choose a way forwards.

*Note that aes interacts weirdly with crypto so changing its semantics may turnout complicated.

ktkachov commented 4 months ago

Features like sve-bf16 have intrinsics that compilers support, does that not count as "compiler support"? Also, sve-i8mm includes the SUDOT and USDOT instructions that have both intrinsics and can (and are) automatically code generated from normal C code

labrinea commented 4 months ago

Features like sve-bf16 have intrinsics that compilers support, does that not count as "compiler support"? Also, sve-i8mm includes the SUDOT and USDOT instructions that have both intrinsics and can (and are) automatically code generated from normal C code

Hey Kyril, thanks for looking at this. In LLVM both the sve-bf16 and sve-i8mm instructions are predicated on finer grained features like sve(2/p1), sme(2), bf16 and i8mm.

labrinea commented 4 months ago

@jroelofs what is your opinion on this proposal? Are there any particular features that you would like to keep from those listed in this PR? For more details please refer to the original PR from Andrew here.

jroelofs commented 4 months ago

I don't find the argument that LLVM lumps features together to be a compelling reason to lump them together on the FMV side of things. We can implement them as if they were lumped together, where that makes sense (via an alias table or something). But IMO, users should be able to mention the name of the feature they're actually depending on, to reduce confusion on the part of people reading code that uses these attributes.

On the axis of whether FMV should support detection of features that do not affect codegen in the compiler, I also have a similarly un-compelled outlook: the point of FMV is not just to toggle compiler-aware features, but instead to give users a consistent and easy-to-reason-about way to write isa-feature-aware versioned functions.

Together, this is why I strongly supported @labrinea's patch to stop optimizing FMV resolvers in terms of implied features. The way LLVM lumps features together for those "implies" relationships at the moment is a bit of an ad-hoc mess which we periodically have to clean up.

labrinea commented 4 months ago

We can implement them as if they were lumped together, where that makes sense (via an alias table or something).

Okay before I start raising separate PRs, in the case of sha1 and sha2 ArmARM says:

SHA2, bits [15:12] 0b0000 No SHA2 instructions implemented. FEAT_SHA256 implements the functionality identified by the value 0b0001 . If the value of ID_AA64ISAR0_EL1.SHA1 is 0b0000 , this field must have the value 0b0000 .

SHA1, bits [11:8] 0b0000 No SHA1 instructions implemented. FEAT_SHA1 implements the functionality identified by the value 0b0001 . If the value of ID_AA64ISAR0_EL1.SHA2 is 0b0000 , this field must have the value 0b0000 .

To my understanding this means you can't have one without the other. Would we like to fuse them in the FMV spec or introduce an alias in the compiler sha1 -> sha2 ? I think the draft should be abandoned based on the above information.

labrinea commented 2 months ago

We are addressing the listed features case by case.