Open Quuxplusone opened 3 years ago
Bugzilla Link | PR48016 |
Status | NEW |
Importance | P enhancement |
Reported by | Craig Topper (craig.topper@gmail.com) |
Reported on | 2020-10-29 16:35:59 -0700 |
Last modified on | 2021-05-29 13:38:19 -0700 |
Version | unspecified |
Hardware | PC All |
CC | cullen.rhodes@arm.com, evandro.menezes@sifive.com, llvm-bugs@lists.llvm.org, matdzb@gmail.com, neeilans@live.com, richard-llvm@metafoo.co.uk |
Thanks for reporting this, Craig.
I spent some time looking into it today and found the problem. The issue is in
the CodeGen for casts between vector-length-agnostic (VLA) and
vector-length-specific (VLS) SVE types. The cast is done through memory and is
implemented in ScalarExprEmitter::VisitCastExpr as follows:
Address Addr = EmitLValue(E).getAddress(CGF);
Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));
LValue DestLV = CGF.MakeAddrLValue(Addr, DestTy);
DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());
return EmitLoadOfLValue(DestLV, CE->getExprLoc());
Blindly calling EmitLValue seems wrong. I think we should check whether the
expression is an r-value and, if so, store it to memory before bitcasting and
loading. That said, doing the bitcast through memory is a temporary solution:
we have work in the pipeline to re-implement this with insert/extract
subvector intrinsics, which should improve the codegen and fix this issue.
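As a rough analogy in ordinary C++ (not Clang's CodeGen, and not the actual fix), the sketch below shows why an r-value has to be stored to memory before a cast through memory can read it. The helper name castThroughMemory is hypothetical. Binding the argument to a const reference forces an r-value to be materialized as a temporary object, which gives it an address to copy from:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper for illustration only; it is not part of Clang.
// Reinterprets the bytes of `src` as a value of type `To` by going
// through memory, mirroring the "bitcast the address, then load" idea.
template <typename To, typename From>
To castThroughMemory(const From &src) {
  static_assert(sizeof(To) == sizeof(From), "types must have the same size");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "types must be trivially copyable");
  // `src` always refers to an object in memory here: an l-value argument
  // is used directly, while an r-value argument has been materialized
  // into a temporary by the reference binding.
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}
```

Calling castThroughMemory<std::uint32_t>(1.0f) passes an r-value, and the reference binding plays the role of the "store it to memory first" step described above.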
The example you attached is defined by the ACLE, but the use of binary operators
with VLS types is undefined unless __ARM_FEATURE_SVE_VECTOR_OPERATORS==1. We
don't currently define this feature macro under -msve-vector-bits since support
is still in progress, although as you say it shouldn't crash.
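A minimal sketch of how user code might guard operator use on VLS types behind the feature macro is shown below. The names VLS_OPERATORS_AVAILABLE and add_i32 are hypothetical, and treating any macro value of at least 1 as sufficient is an assumption of this sketch. On targets where the macros are not defined, only the scalar loop is compiled:

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ARM_FEATURE_SVE_BITS) && (__ARM_FEATURE_SVE_BITS > 0) && \
    defined(__ARM_FEATURE_SVE_VECTOR_OPERATORS) &&                      \
    (__ARM_FEATURE_SVE_VECTOR_OPERATORS >= 1)
#include <arm_sve.h>
#define VLS_OPERATORS_AVAILABLE 1
// Fixed-length (VLS) counterpart of the sizeless (VLA) type svint32_t.
typedef svint32_t fixed_int32_t
    __attribute__((arm_sve_vector_bits(__ARM_FEATURE_SVE_BITS)));
#else
#define VLS_OPERATORS_AVAILABLE 0
#endif

// Hypothetical example function: element-wise out[i] = a[i] + b[i].
void add_i32(const std::int32_t *a, const std::int32_t *b,
             std::int32_t *out, std::size_t n) {
  std::size_t i = 0;
#if VLS_OPERATORS_AVAILABLE
  // Number of 32-bit lanes in one fixed-length vector.
  constexpr std::size_t lanes = __ARM_FEATURE_SVE_BITS / 32;
  const svbool_t all = svptrue_b32();
  for (; i + lanes <= n; i += lanes) {
    // The VLA results of svld1_s32 convert implicitly to the VLS type.
    fixed_int32_t va = svld1_s32(all, a + i);
    fixed_int32_t vb = svld1_s32(all, b + i);
    fixed_int32_t vc = va + vb; // binary operator on VLS types
    svst1_s32(all, out + i, vc);
  }
#endif
  // Scalar loop: handles the tail, or everything when the guard is off.
  for (; i < n; ++i)
    out[i] = a[i] + b[i];
}
```

With this structure the operator expression is only compiled when the compiler advertises support for it, so the same source builds with or without -msve-vector-bits.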
FWIW, it seems to work now and uses llvm.experimental.vector.insert:
-target aarch64-none-linux-gnu -march=armv8-a+sve -msve-vector-bits=128
https://clang.godbolt.org/z/vsGx8c4Y3
-target aarch64-none-linux-gnu -march=armv8-a+sve -msve-vector-bits=256
https://clang.godbolt.org/z/TdceYr7Y5
-target aarch64-none-linux-gnu -march=armv8-a+sve -msve-vector-bits=256 -mllvm -aarch64-sve-vector-bits-min=256
https://clang.godbolt.org/z/EeYMdcKaG
(The latter two are there to generate SVE; otherwise the resulting code uses Neon.)