llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[AArch64] Fold the ushll2 + ushll into zip2 + zip1 #100117

Open vfdff opened 3 months ago

vfdff commented 3 months ago
llvmbot commented 3 months ago

@llvm/issue-subscribers-backend-aarch64

Author: Allen (vfdff)

* test: https://gcc.godbolt.org/z/TsoWzMsdx
  ```
  void zip_noalias (int *b, unsigned short *a, int n) {
  #pragma clang loop vectorize(assume_safety)
    for (int i = 0; i < (n & -4); i++) {
      b[i] = a[i];
    }
  }
  ```
* gcc uses zip, which has a short latency
  ```
  .L5:
          ldr     q30, [x4], 16
          zip1    v29.8h, v30.8h, v31.8h
          zip2    v30.8h, v30.8h, v31.8h
          stp     q29, q30, [x3], 32
          cmp     x4, x5
  ```

davemgreen commented 3 months ago

They likely do this because the zips have higher throughput. We already do the same for truncates (https://gcc.godbolt.org/z/hdqbGvcGY), from https://reviews.llvm.org/D115435.
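
For readers following along, here is a minimal scalar sketch of why the fold is valid. It is not LLVM or GCC code, and every helper name in it is hypothetical. It assumes that `v31` in the GCC output holds all zeros (set up outside the loop shown) and that the 16-bit lanes pair up into 32-bit lanes in little-endian lane order. Under those assumptions, interleaving `a` with zeros via `zip1`/`zip2` yields the same eight 32-bit lanes as zero-extending with `ushll`/`ushll2` and a shift of 0.

```c
#include <stdint.h>

/* Scalar model of ZIP1 on .8h lanes: interleave the low halves of n and m. */
static void zip1_8h(uint16_t d[8], const uint16_t n[8], const uint16_t m[8]) {
    for (int i = 0; i < 4; i++) {
        d[2 * i] = n[i];
        d[2 * i + 1] = m[i];
    }
}

/* Scalar model of ZIP2 on .8h lanes: interleave the high halves of n and m. */
static void zip2_8h(uint16_t d[8], const uint16_t n[8], const uint16_t m[8]) {
    for (int i = 0; i < 4; i++) {
        d[2 * i] = n[4 + i];
        d[2 * i + 1] = m[4 + i];
    }
}

/* Scalar model of USHLL #0: zero-extend the low four u16 lanes to u32. */
static void ushll_4s(uint32_t d[4], const uint16_t n[8]) {
    for (int i = 0; i < 4; i++)
        d[i] = n[i];
}

/* Scalar model of USHLL2 #0: zero-extend the high four u16 lanes to u32. */
static void ushll2_4s(uint32_t d[4], const uint16_t n[8]) {
    for (int i = 0; i < 4; i++)
        d[i] = n[4 + i];
}

/* View a pair of adjacent 16-bit lanes as one 32-bit lane
   (even lane is the low half; assumed little-endian lane order). */
static uint32_t lane32(const uint16_t h[], int k) {
    return (uint32_t)h[2 * k] | ((uint32_t)h[2 * k + 1] << 16);
}

/* Returns 1 when zip1/zip2 against a zero vector gives the same eight
   32-bit lanes as ushll/ushll2, and 0 otherwise. */
int zip_matches_ushll(const uint16_t a[8]) {
    const uint16_t zero[8] = {0};   /* stands in for v31 (assumed zero) */
    uint16_t z[16];
    uint32_t u[8];

    zip1_8h(z, a, zero);
    zip2_8h(z + 8, a, zero);
    ushll_4s(u, a);
    ushll2_4s(u + 4, a);

    for (int k = 0; k < 8; k++)
        if (lane32(z, k) != u[k])
            return 0;
    return 1;
}
```

Calling `zip_matches_ushll` on any eight `uint16_t` values should return 1, since each odd 16-bit lane of the zipped result is zero and so forms the upper half of a zero-extended 32-bit value. The sketch only covers equivalence of results; which sequence is faster depends on the target core's latency and throughput for each instruction.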