Open zhongjuzhe opened 1 year ago
That's highly dependent on the uarch, so I would prefer to just tie it to -mtune, like the other cost models in GCC. That said, I think it's harmless to add it to GCC first to see whether it's useful, then implement it in LLVM, and then document the option here.
Personally, I would prefer not to document these optimization options in this repo, since the flags are compiler-dependent, and to document only the necessary common interface here, such as -march, -mabi and -mcmodel.
Agreed. This is going to be dependent on multiple features of the uarch.
So I think the question is whether or not any such implementations exist or will exist in the near future. If not, then let's not complicate things right now. If it looks like such architectures are on the horizon, then we might as well be prepared for them.
I don't think this will affect Veyron V2.
Consider the following case:
https://godbolt.org/z/oTWvrsGhE
By default, GCC fuses the VTYPE and policy demands of vsetvli as long as they are compatible:
I believe that in most cases the GCC codegen is better.
However, for some vendor RVV CPUs that have vector register renaming and a special vsetvli optimization (vsetvli execution latency is close to 0 cycles most of the time), I believe the following codegen is better:
I think fusing VTYPE is always optimal, for example: https://godbolt.org/z/dfx93jzrv
code:
optimal codegen:
However, policy fusion is not always optimal. Is it reasonable to add a compile option (-mprefer-agnostic) to disable tail policy and mask policy fusion in vsetvli?
Thanks
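To make the trade-off in the question above concrete, here is a rough sketch in C of the decision being discussed. It is a toy model, not GCC's actual vsetvl pass: the names `policy`, `vcfg`, `policy_satisfies`, `needs_vsetvli` and the `prefer_agnostic` parameter are all hypothetical and only mirror the proposed -mprefer-agnostic behavior. It assumes that an instruction demanding an agnostic policy can run under an undisturbed setting, because leaving elements undisturbed is one legal agnostic behavior, which is what makes the policies "compatible" for fusion.

```c
#include <stdbool.h>

/* Hypothetical model of the tail/mask policy bits of vtype. */
enum policy { UNDISTURBED, AGNOSTIC };

struct vcfg {
    enum policy tail;
    enum policy mask;
};

/* A demand for AGNOSTIC is met by either setting; a demand for
   UNDISTURBED is met only by UNDISTURBED. */
static bool policy_satisfies(enum policy have, enum policy want)
{
    return want == AGNOSTIC || have == UNDISTURBED;
}

/* Returns true if a new vsetvli would be emitted before an instruction
   demanding `want` when the current configuration is `cur`.
   With prefer_agnostic set (assumed semantics of the proposed option),
   policies are never fused, so any mismatch forces a new vsetvli. */
static bool needs_vsetvli(struct vcfg cur, struct vcfg want,
                          bool prefer_agnostic)
{
    if (prefer_agnostic)
        return cur.tail != want.tail || cur.mask != want.mask;
    return !(policy_satisfies(cur.tail, want.tail) &&
             policy_satisfies(cur.mask, want.mask));
}
```

Under this model, an agnostic instruction following an undisturbed one reuses the existing vsetvli by default, but gets its own vsetvli when `prefer_agnostic` is set, which is the codegen difference a renaming uarch with a near-free vsetvli might prefer.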