Open vfdff opened 1 year ago
@llvm/issue-subscribers-backend-aarch64
This looks like a backend issue and nothing LV can improve on.
Interestingly, if the IR generated by clang (-emit-llvm) is passed directly to llc, we get the desired results: https://llvm.godbolt.org/z/T9PdbaPTj
Data point: looks nicer with -lsr-preferred-addressing-mode=postindexed
https://gcc.godbolt.org/z/7dK1rxv4j
> This looks like a backend issue and nothing LV can improve on.
> Interestingly, if the IR generated by clang (-emit-llvm) is passed directly to llc, we get the desired results: https://llvm.godbolt.org/z/T9PdbaPTj
The difference begins with the IR dump after Loop Strength Reduction (loop-reduce):
method1: the IR generated by clang (-emit-llvm) is passed directly to llc

    vector.body:                                      ; preds = %vector.body, %entry
      %lsr.iv = phi i64 [ %lsr.iv.next, %vector.body ], [ 0, %entry ], !dbg !53
      %scevgep9 = getelementptr i8, ptr @b, i64 %lsr.iv, !dbg !55
      %wide.load = load <2 x double>, ptr %scevgep9, align 8, !dbg !55, !tbaa !57
method2: the IR directly generated by clang

    vector.body:                                      ; preds = %vector.body, %entry
      %lsr.iv = phi i64 [ %lsr.iv.next, %vector.body ], [ -8192, %entry ], !dbg !53
      %scevgep = getelementptr i8, ptr @b, i64 %lsr.iv, !dbg !55
      %scevgep43 = getelementptr i8, ptr %scevgep, i64 8192, !dbg !55
      %wide.load = load <2 x double>, ptr %scevgep43, align 8, !dbg !55, !tbaa !57
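
To make the two induction-variable shapes easier to compare, here is a rough C model of them. It is only a sketch under assumptions that the dumps do not confirm: `b` is taken to hold 1024 doubles (8192 bytes), each iteration is taken to load one `<2 x double>` (16 bytes), and the exit conditions (`iv != 8192` and `iv != 0`) are guesses, since the compare is not part of the snippets. All function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 1024                                    /* assumed: 1024 doubles, i.e. 8192 bytes */
#define TOTAL_BYTES ((long)(N * sizeof(double)))
#define STEP_BYTES  ((long)(2 * sizeof(double)))  /* assumed: one <2 x double> per iteration */

static double b[N];                               /* stands in for the global @b */

void fill_b(void) {
    for (size_t i = 0; i < N; i++)
        b[i] = (double)i;
}

/* Load two doubles from b at a byte offset, like the <2 x double> load. */
static double load_pair_sum(long byte_offset) {
    double pair[2];
    assert(byte_offset >= 0 && byte_offset + STEP_BYTES <= TOTAL_BYTES);
    memcpy(pair, (const char *)b + byte_offset, sizeof pair);
    return pair[0] + pair[1];
}

/* method1 shape: the IV starts at 0 and is the byte offset itself. */
double sum_iv_from_zero(void) {
    double sum = 0.0;
    for (long iv = 0; iv != TOTAL_BYTES; iv += STEP_BYTES)
        sum += load_pair_sum(iv);
    return sum;
}

/* method2 shape: the IV starts at -8192, so every access adds 8192 back. */
double sum_iv_from_negative(void) {
    double sum = 0.0;
    for (long iv = -TOTAL_BYTES; iv != 0; iv += STEP_BYTES)
        sum += load_pair_sum(iv + TOTAL_BYTES);
    return sum;
}
```

After `fill_b()`, both functions visit the same addresses in the same order and return the same sum. The only difference is where the constant 8192 lives: in the loop bound for method1, or in every address computation for method2, which is the extra `getelementptr` seen in the second dump.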
Further, I found that the order of the users is reversed when the IR is passed to llc (this can be dumped with -debug-only=iv-users), but I don't know how to trace the order of I->uses() any further.
Test: https://gcc.godbolt.org/z/3c8f3caKo
Clang: the base addresses are reused with x0 and x1
GCC: only a single add x0, x0, 16 is needed to update all the memory addresses
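
For readers who do not open the godbolt link, the idea of one update serving every memory address can be sketched at the C level. This is not the linked test: the element-wise add kernel, the array names, and the size are all assumptions, and a compiler is free to turn either form into the other. The second function is shown only for contrast, with each stream carrying its own pointer, roughly what a post-indexed addressing style implies.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024                       /* assumed element count */

static double b[N], c[N];            /* hypothetical inputs */
static double out_shared[N], out_ptr[N];

/* Shared index: one counter update per iteration serves every array,
   comparable in spirit to a single `add x0, x0, 16`. */
void add_shared_index(void) {
    for (size_t i = 0; i < N; i += 2) {
        out_shared[i]     = b[i]     + c[i];
        out_shared[i + 1] = b[i + 1] + c[i + 1];
    }
}

/* Per-pointer updates: every stream advances its own pointer. */
void add_per_pointer(void) {
    double *po = out_ptr;
    const double *pb = b, *pc = c;
    for (size_t n = 0; n < N; n += 2) {
        po[0] = pb[0] + pc[0];
        po[1] = pb[1] + pc[1];
        po += 2;
        pb += 2;
        pc += 2;
    }
}

/* Fill the inputs, run both forms, and report whether the outputs agree. */
int results_match(void) {
    assert(N % 2 == 0);              /* both loops step by two elements */
    for (size_t i = 0; i < N; i++) {
        b[i] = (double)i;
        c[i] = 2.0 * (double)i;
    }
    add_shared_index();
    add_per_pointer();
    for (size_t i = 0; i < N; i++)
        if (out_shared[i] != out_ptr[i])
            return 0;
    return 1;
}
```

`results_match()` returns 1 because both forms compute the same values. What differs is how many address registers have to be updated per iteration, which is the cost this issue is about.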